anchorxia commited on
Commit
a57c6eb
1 Parent(s): f7d3f4d
This view is limited to 50 files because it contains too many changes.   See raw diff
Files changed (50) hide show
  1. MuseV/MMCM/.gitignore +139 -0
  2. MuseV/MMCM/Dockerfile +83 -0
  3. MuseV/MMCM/README.md +2 -0
  4. MuseV/MMCM/mmcm/__init__.py +6 -0
  5. MuseV/MMCM/mmcm/audio/__init__.py +0 -0
  6. MuseV/MMCM/mmcm/data/__init__.py +9 -0
  7. MuseV/MMCM/mmcm/data/clip.py +324 -0
  8. MuseV/MMCM/mmcm/data/clip/__init__.py +5 -0
  9. MuseV/MMCM/mmcm/data/clip/clip.py +197 -0
  10. MuseV/MMCM/mmcm/data/clip/clip_filter.py +46 -0
  11. MuseV/MMCM/mmcm/data/clip/clip_fusion.py +64 -0
  12. MuseV/MMCM/mmcm/data/clip/clip_process.py +366 -0
  13. MuseV/MMCM/mmcm/data/clip/clip_stat.py +13 -0
  14. MuseV/MMCM/mmcm/data/clip/clipid.py +70 -0
  15. MuseV/MMCM/mmcm/data/crawl/__init__.py +0 -0
  16. MuseV/MMCM/mmcm/data/crawl/download.py +72 -0
  17. MuseV/MMCM/mmcm/data/crawl/error.py +20 -0
  18. MuseV/MMCM/mmcm/data/crawl/ffmpeg.py +39 -0
  19. MuseV/MMCM/mmcm/data/crawl/flicker.py +22 -0
  20. MuseV/MMCM/mmcm/data/crawl/youtube.py +13 -0
  21. MuseV/MMCM/mmcm/data/emb/__init__.py +2 -0
  22. MuseV/MMCM/mmcm/data/emb/emb.py +104 -0
  23. MuseV/MMCM/mmcm/data/emb/h5py_emb.py +119 -0
  24. MuseV/MMCM/mmcm/data/emb/json_emb.py +0 -0
  25. MuseV/MMCM/mmcm/data/emb/numpy_emb.py +0 -0
  26. MuseV/MMCM/mmcm/data/extract_feature/__init__.py +0 -0
  27. MuseV/MMCM/mmcm/data/extract_feature/base_extract_feature.py +28 -0
  28. MuseV/MMCM/mmcm/data/general/__init__.py +1 -0
  29. MuseV/MMCM/mmcm/data/general/items.py +69 -0
  30. MuseV/MMCM/mmcm/data/media_map/__init__.py +1 -0
  31. MuseV/MMCM/mmcm/data/media_map/media_map.py +393 -0
  32. MuseV/MMCM/mmcm/data/media_map/media_map_process.py +72 -0
  33. MuseV/MMCM/mmcm/music/__init__.py +6 -0
  34. MuseV/MMCM/mmcm/music/music_map/__init__.py +0 -0
  35. MuseV/MMCM/mmcm/music/music_map/beat_map.py +82 -0
  36. MuseV/MMCM/mmcm/music/music_map/clip_process.py +196 -0
  37. MuseV/MMCM/mmcm/music/music_map/convert_type.py +57 -0
  38. MuseV/MMCM/mmcm/music/music_map/load_music_map.py +38 -0
  39. MuseV/MMCM/mmcm/music/music_map/lyric_map.py +149 -0
  40. MuseV/MMCM/mmcm/music/music_map/lyric_process.py +515 -0
  41. MuseV/MMCM/mmcm/music/music_map/meta_info.py +21 -0
  42. MuseV/MMCM/mmcm/music/music_map/mss_map.py +185 -0
  43. MuseV/MMCM/mmcm/music/music_map/music_clip.py +83 -0
  44. MuseV/MMCM/mmcm/music/music_map/music_map.py +140 -0
  45. MuseV/MMCM/mmcm/music/music_map/music_map_demp.py +58 -0
  46. MuseV/MMCM/mmcm/music/utils/__init__.py +0 -0
  47. MuseV/MMCM/mmcm/music/utils/path_util.py +9 -0
  48. MuseV/MMCM/mmcm/t2p/.gitignore +158 -0
  49. MuseV/MMCM/mmcm/t2p/GPT_eval_multi.py +121 -0
  50. MuseV/MMCM/mmcm/t2p/LICENSE +201 -0
MuseV/MMCM/.gitignore ADDED
@@ -0,0 +1,139 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ # Byte-compiled / optimized / DLL files
2
+ __pycache__/
3
+ *.py[cod]
4
+ *$py.class
5
+
6
+ # C extensions
7
+ *.so
8
+
9
+ # Distribution / packaging
10
+ .Python
11
+ build/
12
+ develop-eggs/
13
+ dist/
14
+ downloads/
15
+ eggs/
16
+ .eggs/
17
+ lib/
18
+ lib64/
19
+ parts/
20
+ sdist/
21
+ var/
22
+ wheels/
23
+ pip-wheel-metadata/
24
+ share/python-wheels/
25
+ *.egg-info/
26
+ .installed.cfg
27
+ *.egg
28
+ MANIFEST
29
+
30
+ # PyInstaller
31
+ # Usually these files are written by a python script from a template
32
+ # before PyInstaller builds the exe, so as to inject date/other infos into it.
33
+ *.manifest
34
+ *.spec
35
+
36
+ # Installer logs
37
+ pip-log.txt
38
+ pip-delete-this-directory.txt
39
+
40
+ # Unit test / coverage reports
41
+ htmlcov/
42
+ .tox/
43
+ .nox/
44
+ .coverage
45
+ .coverage.*
46
+ .cache
47
+ nosetests.xml
48
+ coverage.xml
49
+ *.cover
50
+ .hypothesis/
51
+ .pytest_cache/
52
+
53
+ # Translations
54
+ *.mo
55
+ *.pot
56
+
57
+ # Django stuff:
58
+ *.log
59
+ local_settings.py
60
+ db.sqlite3
61
+
62
+ # Flask stuff:
63
+ instance/
64
+ .webassets-cache
65
+
66
+ # Scrapy stuff:
67
+ .scrapy
68
+
69
+ # Sphinx documentation
70
+ docs/_build/
71
+
72
+ # PyBuilder
73
+ target/
74
+
75
+ # Jupyter Notebook
76
+ .ipynb_checkpoints
77
+
78
+ # IPython
79
+ profile_default/
80
+ ipython_config.py
81
+
82
+ # pyenv
83
+ .python-version
84
+
85
+ # pipenv
86
+ # According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control.
87
+ # However, in case of collaboration, if having platform-specific dependencies or dependencies
88
+ # having no cross-platform support, pipenv may install dependencies that don’t work, or not
89
+ # install all needed dependencies.
90
+ #Pipfile.lock
91
+
92
+ # celery beat schedule file
93
+ celerybeat-schedule
94
+
95
+ # SageMath parsed files
96
+ *.sage.py
97
+
98
+ # Environments
99
+ .env
100
+ .venv
101
+ env/
102
+ venv/
103
+ ENV/
104
+ env.bak/
105
+ venv.bak/
106
+
107
+ # Spyder project settings
108
+ .spyderproject
109
+ .spyproject
110
+
111
+ # Rope project settings
112
+ .ropeproject
113
+
114
+ # mkdocs documentation
115
+ /site
116
+
117
+ # mypy
118
+ .mypy_cache/
119
+ .dmypy.json
120
+ dmypy.json
121
+
122
+ # Pyre type checker
123
+ .pyre/
124
+
125
+ *.swp
126
+ .*.swp
127
+ dataset/files
128
+ experiments
129
+ log
130
+ csvs
131
+
132
+ .idea
133
+ .vscode
134
+ __pycache__/
135
+ *.code-workspace
136
+ .DS_Store
137
+ third_party/
138
+ .polaris_cache/
139
+ *.lock
MuseV/MMCM/Dockerfile ADDED
@@ -0,0 +1,83 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ # FROM mirrors.tencent.com/todacc/venus-std-base-cuda11.8:0.1.0
2
+ FROM mirrors.tencent.com/todacc/venus-std-ext-cuda11.8-pytorch2.0-tf2.12-py3.10:0.7.0
3
+
4
+ #MAINTAINER 维护者信息
5
+ LABEL MAINTAINER="anchorxia"
6
+ LABEL Email="[email protected]"
7
+ LABEL Description="gpu development image, from mirrors.tencent.com/todacc/venus-std-ext-cuda11.8-pytorch2.0-tf2.12-py3.10:0.7.0"
8
+
9
+ USER root
10
+ # 安装必须软件
11
+ # RUN GENERIC_REPO_URL="http://mirrors.tencent.com/repository/generic/venus_repo/image_res" \
12
+ # && cd /data/ \
13
+ # && wget -q $GENERIC_REPO_URL/gcc/gcc-11.2.0.zip \
14
+ # && unzip -q gcc-11.2.0.zip \
15
+ # && cd gcc-releases-gcc-11.2.0 \
16
+ # && ./contrib/download_prerequisites \
17
+ # && ./configure --enable-bootstrap --enable-languages=c,c++ --enable-threads=posix --enable-checking=release --enable-multilib --with-system-zlib \
18
+ # && make --silent -j10 \
19
+ # && make --silent install \
20
+ # && gcc -v \
21
+ # && rm -rf /data/gcc-releases-gcc-11.2.0 /data/gcc-11.2.0.zip
22
+
23
+ # RUN yum update -y \
24
+ # && yum install -y epel-release \
25
+ # && yum install -y ffmpeg \
26
+ # && yum install -y Xvfb \
27
+ # && yum install -y centos-release-scl devtoolset-11
28
+ RUN yum install -y wget zsh git curl tmux cmake htop iotop git-lfs zip \
29
+ && yum install -y autojump autojump-zsh portaudio portaudio-devel \
30
+ && yum clean all
31
+
32
+ USER mqq
33
+ RUN source ~/.bashrc \
34
+ && GENERIC_REPO_URL="http://mirrors.tencent.com/repository/generic/venus_repo/image_res" \
35
+ && conda deactivate \
36
+ # && conda remove -y -n env-2.7.18 --all \
37
+ # && conda remove -y -n env-3.6.8 --all \
38
+ # && conda remove -y -n env-3.7.7 --all \
39
+ # && conda remove -y -n env-3.8.8 --all \
40
+ # && conda remove -y -n env-3.9.2 --all \
41
+ # && conda remove -y -n env-novelai --all \
42
+ && conda create -n projectv python=3.10.6 -y \
43
+ && conda activate projectv \
44
+ && pip install venus-sdk -q -i https://mirrors.tencent.com/repository/pypi/tencent_pypi/simple \
45
+ --extra-index-url https://mirrors.tencent.com/pypi/simple/ \
46
+ && pip install tensorflow==2.12.0 tensorboard==2.12.0 \
47
+ && pip install torch==2.0.1+cu118 torchvision==0.15.2+cu118 -f https://mirror.sjtu.edu.cn/pytorch-wheels/torch_stable.html -i https://mirrors.bfsu.edu.cn/pypi/web/simple -U \
48
+ # 安装xformers,支持不同型号gpu
49
+ && pip install ninja==1.11.1 \
50
+ # && git clone https://github.com/facebookresearch/xformers.git \
51
+ # && cd xformers \
52
+ # && git checkout v0.0.17rc482 \
53
+ # && git submodule update --init --recursive \
54
+ # && pip install numpy==1.23.4 pyre-extensions==0.0.23 \
55
+ # && FORCE_CUDA="1" MAX_JOBS=1 TORCH_CUDA_ARCH_LIST="6.1;7.0;7.5;8.0;8.6" pip install -e . \
56
+ # && cd .. \
57
+ # 安装一堆包
58
+ && pip install --no-cache-dir transformers bitsandbytes decord accelerate xformers omegaconf einops imageio==2.31.1 \
59
+ && pip install --no-cache-dir pandas h5py matplotlib modelcards pynvml black pytest moviepy torch-tb-profiler scikit-learn librosa ffmpeg easydict webp controlnet_aux mediapipe \
60
+ && pip install --no-cache-dir Cython easydict gdown infomap insightface ipython librosa onnx onnxruntime onnxsim opencv_python Pillow protobuf pytube PyYAML \
61
+ && pip install --no-cache-dir requests scipy six tqdm gradio albumentations opencv-contrib-python imageio-ffmpeg pytorch-lightning test-tube \
62
+ && pip install --no-cache-dir timm addict yapf prettytable safetensors basicsr fvcore pycocotools wandb gunicorn \
63
+ && pip install --no-cache-dir streamlit webdataset kornia open_clip_torch streamlit-drawable-canvas torchmetrics \
64
+ # 安装暗水印
65
+ && pip install --no-cache-dir invisible-watermark==0.1.5 gdown==4.5.3 ftfy==6.1.1 modelcards==0.1.6 \
66
+ # 安装openmm相关包
67
+ && pip install --no-cache-dir -U openmim \
68
+ && mim install mmengine \
69
+ && mim install "mmcv>=2.0.1" \
70
+ && mim install "mmdet>=3.1.0" \
71
+ && mim install "mmpose>=1.1.0" \
72
+ # jupyters
73
+ && pip install ipywidgets==8.0.3 \
74
+ && python -m ipykernel install --user --name projectv --display-name "python(projectv)" \
75
+ && pip install --no-cache-dir matplotlib==3.6.2 redis==4.5.1 pydantic[dotenv]==1.10.2 loguru==0.6.0 IProgress==0.4 \
76
+ && pip install --no-cache-dir cos-python-sdk-v5==1.9.22 coscmd==1.8.6.30 \
77
+ # 必须放在最后pip,避免和jupyter的不兼容
78
+ && pip install --no-cache-dir markupsafe==2.0.1 \
79
+ && wget -P /tmp $GENERIC_REPO_URL/cpu/clean-layer.sh \
80
+ && sh /tmp/clean-layer.sh
81
+
82
+ ENV LD_LIBRARY_PATH=/usr/local/lib64:$LD_LIBRARY_PATH
83
+ USER root
MuseV/MMCM/README.md ADDED
@@ -0,0 +1,2 @@
 
 
 
1
+ # MMCM
2
+ Process package for multi media, cross multi modal.
MuseV/MMCM/mmcm/__init__.py ADDED
@@ -0,0 +1,6 @@
 
 
 
 
 
 
 
1
+ from .audio import *
2
+ from .data import *
3
+ from .music import *
4
+ from .text import *
5
+ from .vision import *
6
+ from .t2p import *
MuseV/MMCM/mmcm/audio/__init__.py ADDED
File without changes
MuseV/MMCM/mmcm/data/__init__.py ADDED
@@ -0,0 +1,9 @@
 
 
 
 
 
 
 
 
 
 
1
+ from .general.items import Items, Item
2
+
3
+ from .emb.emb import MediaMapEmb
4
+ from .emb.h5py_emb import H5pyMediaMapEmb, H5pyMediaMapEmbProxy
5
+
6
+ from .media_map.media_map import MediaMap, MetaInfo, MetaInfoList, MediaMapSeq
7
+ from .media_map.media_map_process import get_sub_mediamap_by_clip_idx, get_sub_mediamap_by_stage, get_subseq_by_time
8
+ from .clip.clip import Clip, ClipSeq
9
+ from .clip.clipid import ClipIds, ClipIdsSeq, MatchedClipIds, MatchedClipIdsSeq
MuseV/MMCM/mmcm/data/clip.py ADDED
@@ -0,0 +1,324 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ from copy import deepcopy
2
+ from typing import Iterable
3
+ import logging
4
+
5
+ import numpy as np
6
+
7
+ from ..utils.util import convert_class_attr_to_dict
8
+
9
+ logger = logging.getLogger(__name__) # pylint: disable=invalid-name
10
+
11
+
12
class Clip(object):
    """A media clip: the segment between two transition points.

    NOTE(review): the original declared ``class Clip(object, Item)``, but
    ``Item`` is never imported in this module (NameError at class creation)
    and ``(object, Item)`` would be an inconsistent MRO anyway. The newer
    ``clip/clip.py`` defines ``Clip(Item)``; here the unresolved base is
    dropped so this legacy module stays importable.
    """

    def __init__(
        self,
        time_start,
        duration,
        clipid=None,
        media_type=None,
        mediaid=None,
        timepoint_type=None,
        text=None,
        stage=None,
        path=None,
        duration_num=None,
        group_time_start=0,
        group_clipid=None,
        original_clipid=None,
        emb=None,
        multi_factor=None,
        similar_clipseq=None,
        rythm: float = None,
        **kwargs
    ):
        """
        Args:
            time_start (float): start time in seconds within the media file;
                matches the index in media_map.json.
            duration (float): clip duration in seconds.
            clipid (int, or [int]): clip index provided by the media map;
                matches the index in media_map.json.
            media_type (str, optional): music, video or text. Defaults to None.
            mediaid (int): media id; a list-valued clipid marks a fused clip.
            timepoint_type (int, optional): transition type of the start
                point. Defaults to None.
            text (str, optional): textual description of the clip — lyrics
                for music, dialogue or even on-screen comments for video.
            stage (str, optional): structural position within the media,
                e.g. intro/chorus/verse for music; opening/ending/climax/
                transition for video. Defaults to None.
            path (str, optional): media file path for later reading and
                processing. Defaults to None.
            duration_num (int, optional): clip duration in frames.
            group_time_start (float, optional): when editing several songs or
                videos, the summed clip duration of all sub-media preceding
                the one this clip belongs to. 0 means a single media file.
                Defaults to 0.
            group_clipid (int, optional): actual index in
                MediaInfo.sub_meta_info.
            original_clipid (None or [int], optional): when this clip was
                produced by merging others, the source clip indices in
                media_map.json. Defaults to None (stored as []).
            emb (np.array, optional): aggregated clip embedding.
            multi_factor (MultiFactorFeature, optional): multi-facet features.
            similar_clipseq ([Clip], optional): clips similar to this one;
                exact structure still to be defined. Defaults to None.
            rythm (float, optional): rhythm score of the clip.
        """
        self.media_type = media_type
        self.mediaid = mediaid
        self.time_start = time_start
        self.duration = duration
        self.clipid = clipid
        self.path = path
        self.timepoint_type = timepoint_type
        self.text = text
        self.stage = stage
        self.group_time_start = group_time_start
        self.group_clipid = group_clipid
        self.duration_num = duration_num
        # Avoid a shared mutable default: None becomes a fresh list.
        self.original_clipid = original_clipid if original_clipid is not None else []
        self.emb = emb
        self.multi_factor = multi_factor
        self.similar_clipseq = similar_clipseq
        self.rythm = rythm
        # TODO: maps currently carry some unnecessary intermediate fields that
        # are memory-heavy; keep passing them through until the data protocol
        # is finalized.
        self.__dict__.update(kwargs)
        self.preprocess()

    def preprocess(self):
        # Hook for subclasses; no-op here.
        pass

    def spread_parameters(self):
        # Hook for subclasses; no-op here.
        pass

    @property
    def time_end(
        self,
    ):
        """End time in seconds: time_start + duration."""
        return self.time_start + self.duration

    @property
    def mvp_clip(self):
        """Load the actual clip data in moviepy format.

        Raises:
            NotImplementedError: must be provided by subclasses.
        """
        raise NotImplementedError
98
+
99
+
100
class ClipSeq(object):
    """A sequence of media clips."""

    # Class used to build clips when dicts are passed to __init__.
    ClipClass = Clip

    def __init__(self, clips) -> None:
        """Build a clip sequence.

        Args:
            clips (Clip, [Clip] or [dict]): clip sequence; a single clip is
                wrapped in a list, dicts are converted via ``ClipClass``.
        """
        if not isinstance(clips, list):
            clips = [clips]
        if len(clips) == 0:
            self.clips = []
        elif isinstance(clips[0], dict):
            self.clips = [self.ClipClass(**d) for d in clips]
        else:
            self.clips = clips

    def set_clip_value(self, k, v):
        """Assign attribute ``k`` = ``v`` on every clip in the sequence."""
        for i in range(len(self.clips)):
            self.clips[i].__setattr__(k, v)

    def __len__(
        self,
    ):
        return len(self.clips)

    def merge(self, other, group_time_start_delta=None, groupid_delta=None):
        """Merge another ClipSeq into this one. When media_info objects are
        merged, each clip tracks its groupid and group_time_start; the deltas
        describe how those shift for the incoming clips.

        Args:
            other (ClipSeq): sequence to merge in.
            group_time_start_delta (float, optional): added to each incoming
                clip's group_time_start. Defaults to None.
            groupid_delta (int, optional): added to each incoming clip's
                ``groupid``. Defaults to None.
                NOTE(review): ``Clip.__init__`` in this module does not set
                ``groupid`` — presumably assigned elsewhere; confirm.
        """
        if group_time_start_delta is not None or groupid_delta is not None:
            for i, clip in enumerate(other):
                if group_time_start_delta is not None:
                    clip.group_time_start += group_time_start_delta
                if groupid_delta is not None:
                    clip.groupid += groupid_delta
        self.clips.extend(other.clips)
        # Re-number group_clipid to match the new positions.
        for i in range(len(self.clips)):
            self.clips[i].group_clipid = i

    @property
    def duration(
        self,
    ):
        """Sum of Clip.duration over the sequence.

        Returns:
            float: total duration of the sequence.
        """
        if len(self.clips) == 0:
            return 0
        else:
            return sum([c.duration for c in self.clips])

    def __getitem__(self, i) -> Clip:
        """Index and slice access: an int yields a Clip; an iterable or a
        slice yields a new ClipSeq.

        Args:
            i (int, Iterable or slice): index

        Raises:
            ValueError: for unsupported index types

        Returns:
            Clip or ClipSeq:
        """
        # Coerce numpy integer scalars ("int64" etc.) to a plain int first.
        if "int" in str(type(i)):
            i = int(i)
        if isinstance(i, int):
            clip = self.clips[i]
            return clip
        elif isinstance(i, Iterable):
            clips = [self.__getitem__(x) for x in i]
            clipseq = ClipSeq(clips)
            return clipseq
        elif isinstance(i, slice):
            # NOTE(review): a slice with start/stop of None would break
            # range(); only fully-specified slices are supported here.
            if i.step is None:
                step = 1
            else:
                step = i.step
            clips = [self.__getitem__(x) for x in range(i.start, i.stop, step)]
            clipseq = ClipSeq(clips)
            return clipseq
        else:
            raise ValueError(
                "unsupported input, should be int or slice, but given {}, type={}".format(
                    i, type(i)
                )
            )

    def insert(self, idx, obj):
        """Insert a clip at position ``idx``."""
        self.clips.insert(idx, obj)

    def append(self, obj):
        """Append a clip to the sequence."""
        self.clips.append(obj)

    def extend(self, objs):
        """Extend the sequence with an iterable of clips."""
        self.clips.extend(objs)

    @property
    def duration_seq_emb(
        self,
    ):
        # 1-D array of per-clip durations.
        emb = np.array([c.duration for c in self.clips])
        return emb

    @property
    def timestamp_seq_emb(self):
        # 1-D array of per-clip start times.
        emb = np.array([c.time_start for c in self.clips])
        return emb

    @property
    def rela_timestamp_seq_emb(self):
        # Start times normalized by the total duration.
        emb = self.timestamp_seq_emb / self.duration
        return emb

    def get_factor_seq_emb(self, factor, dim):
        """Stack each clip's ``multi_factor[factor]`` vector into an (N, dim)
        array; clips missing the factor contribute a +inf-filled vector."""
        emb = []
        for c in self.clips:
            if factor not in c.multi_factor or c.multi_factor[factor] is None:
                v = np.full(dim, np.inf)
            else:
                v = c.multi_factor[factor]
            emb.append(v)
        emb = np.stack(emb, axis=0)
        return emb

    def semantic_seq_emb(self, dim):
        """Per-clip semantic feature matrix."""
        return self.get_factor_seq_emb(factor="semantics", dim=dim)

    def emotion_seq_emb(self, dim):
        """Per-clip emotion feature matrix."""
        return self.get_factor_seq_emb(factor="emotion", dim=dim)

    def theme_seq_emb(self, dim):
        """Per-clip theme feature matrix."""
        return self.get_factor_seq_emb(factor="theme", dim=dim)

    def to_dct(
        self,
        target_keys=None,
        ignored_keys=None,
    ):
        """Serialize each clip to a dict via ``clip.to_dct``.

        NOTE(review): relies on the clip class providing ``to_dct``; the
        Clip defined in this module does not — presumably inherited in the
        newer Item-based Clip; confirm before use.
        """
        if ignored_keys is None:
            ignored_keys = ["kwargs", "audio_path", "lyric_path", "start", "end"]
        clips = [
            clip.to_dct(target_keys=target_keys, ignored_keys=ignored_keys)
            for clip in self.clips
        ]
        return clips

    @property
    def mvp_clip(self):
        """Load the actual clip data in moviepy format.

        Raises:
            NotImplementedError: must be provided by subclasses.
        """
        raise NotImplementedError
265
+
266
+
267
class ClipIds(object):
    """Indices into a ClipSeq, used mainly for clips fused from several
    clips — e.g. one MusicClip matched against several VideoClips, whose
    indices are described by a single ClipIds.
    """

    def __init__(
        self,
        clipids: list or int,
    ) -> None:
        """
        Args:
            clipids (list or int): index (or indices) into the ClipSeq; a
                single int is stored as a one-element list.
        """
        if isinstance(clipids, list):
            self.clipids = clipids
        else:
            self.clipids = [clipids]
279
+
280
+
281
class ClipIdsSeq(object):
    """A list of ClipIds, e.g. for regrouping a MediaClipSeq into a
    coarser-grained ClipSeq."""

    def __init__(self, clipids_seq: list) -> None:
        """
        Args:
            clipids_seq (list): list of grouped ClipIds; a single ClipIds is
                wrapped in a list.

        Note:
            The original tested ``isinstance(clipids_seq, ClipIds)`` and so
            wrapped *list* inputs (nesting them one level deep) while leaving
            a bare ClipIds unwrapped — the inverse of the documented intent.
            Wrapping non-list inputs restores it.
        """
        self.clipids_seq = (
            clipids_seq if isinstance(clipids_seq, list) else [clipids_seq]
        )
292
+
293
+
294
# TODO: metric may become a dict later
class MatchedClipIds(object):
    def __init__(
        self, id1: ClipIds, id2: ClipIds, metric: float = None, **kwargs
    ) -> None:
        """A matched pair of clip fragments from two modalities — e.g. the
        match between a music clip and a video clip.

        Args:
            id1 (ClipIds): fragment of the first modality; bare values are
                wrapped into ClipIds.
            id2 (ClipIds): fragment of the second modality; bare values are
                wrapped into ClipIds.
            metric (float): matching distance of the pair.
        """
        self.id1 = id1 if isinstance(id1, ClipIds) else ClipIds(id1)
        self.id2 = id2 if isinstance(id2, ClipIds) else ClipIds(id2)
        self.metric = metric
        # Extra fields become plain attributes.
        self.__dict__.update(**kwargs)
311
+
312
+
313
class MatchedClipIdsSeq(object):
    """A matched pair of sequences from two modalities — e.g. a music clip
    sequence matched against a video clip sequence, where each element is a
    MatchedClipIds.
    """

    def __init__(self, seq: list, metric: float = None, **kwargs) -> None:
        """
        Args:
            seq (list): list of matched pairs between the two modalities.
            metric (float): matching distance of the whole sequence.
        """
        self.metric = metric
        self.seq = seq
        for key, value in kwargs.items():
            setattr(self, key, value)
MuseV/MMCM/mmcm/data/clip/__init__.py ADDED
@@ -0,0 +1,5 @@
 
 
 
 
 
 
1
+ from .clip import Clip, ClipSeq
2
+ from .clipid import ClipIds, MatchedClipIds, ClipIdsSeq, MatchedClipIdsSeq
3
+ from .clip_process import find_idx_by_time, find_idx_by_clip, get_subseq_by_time, get_subseq_by_idx, clip_is_top, clip_is_middle, clip_is_end, abadon_old_return_new, reset_clipseq_id, insert_endclip, insert_startclip, drop_start_end_by_time, complete_clipseq, complete_gap
4
+ from .clip_stat import stat_clipseq_duration
5
+ from .clip_filter import ClipFilter, ClipSeqFilter
MuseV/MMCM/mmcm/data/clip/clip.py ADDED
@@ -0,0 +1,197 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ from __future__ import annotations
2
+ from copy import deepcopy
3
+
4
+ from typing import Iterable, List, Tuple, Dict, Hashable, Any, Union
5
+
6
+ import numpy as np
7
+
8
+ from ...utils.util import convert_class_attr_to_dict
9
+
10
+
11
+ from ..general.items import Items, Item
12
+ from .clipid import MatchedClipIds
13
+
14
+
15
+ import logging
16
+
17
+ logger = logging.getLogger(__name__) # pylint: disable=invalid-name
18
+
19
+
20
+ __all__ = ["Clip", "ClipSeq"]
21
+
22
+
23
class Clip(Item):
    """A media clip: the segment between two transition points."""

    def __init__(
        self,
        time_start: float,
        duration: float,
        clipid: int = None,
        media_type: str = None,
        mediaid: str = None,
        timepoint_type: str = None,
        text: str = None,
        stage: str = None,
        path: str = None,
        duration_num: int = None,
        similar_clipseq: MatchedClipIds = None,
        dynamic: float = None,
        **kwargs,
    ):
        """
        Args:
            time_start (float): start time in seconds within the media file;
                matches the index in media_map.json.
            duration (float): clip duration in seconds.
            clipid (int, or [int]): clip index from the media map; a list
                marks a fused clip.
            media_type (str, optional): music, video or text. Defaults to None.
            mediaid (str): media id.
            timepoint_type (str, optional): transition type of the start
                point. Defaults to None.
            text (str, optional): textual description — lyrics for music,
                dialogue or on-screen comments for video. Defaults to None.
            stage (str, optional): structural position within the media,
                e.g. intro/chorus/verse for music; opening/ending/climax/
                transition for video. Defaults to None.
            path (str, optional): media file path for later reading and
                processing. Defaults to None.
            duration_num (int, optional): clip duration in frames.
            similar_clipseq (MatchedClipIds, optional): clips similar to this
                one; exact structure still to be defined. Defaults to None.
            dynamic (float, optional): dynamics score of the clip.
        """
        self.media_type = media_type
        self.mediaid = mediaid
        self.time_start = time_start
        self.duration = duration
        self.clipid = clipid
        self.path = path
        self.timepoint_type = timepoint_type
        self.text = text
        self.stage = stage
        self.duration_num = duration_num
        self.similar_clipseq = similar_clipseq
        self.dynamic = dynamic
        # Extra map fields become plain attributes.
        self.__dict__.update(**kwargs)

    def preprocess(self):
        # Hook for subclasses; no-op here.
        pass

    def spread_parameters(self):
        # Hook for subclasses; no-op here.
        pass

    @property
    def time_end(
        self,
    ) -> float:
        """End time in seconds: time_start + duration."""
        return self.time_start + self.duration

    def get_emb(self, key: str, idx: int) -> float:
        # Annotation fixed from ``np.float`` (alias removed in NumPy >= 1.24);
        # lazy under the module's ``from __future__ import annotations``.
        # NOTE(review): relies on ``self.emb`` being attached externally —
        # it is not set in __init__; confirm against the owning media map.
        return self.emb.get_value(key, idx)
84
+
85
+
86
class ClipSeq(Items):
    """A sequence of media clips."""

    def __init__(self, items: List[Clip] = None):
        super().__init__(items)
        # Alias: Items stores its payload in ``self.data``.
        self.clipseq = self.data

    def preprocess(self):
        # Hook for subclasses; no-op here.
        pass

    def set_clip_value(self, k: Hashable, v: Any) -> None:
        """Assign attribute ``k`` = ``v`` on every clip in the sequence."""
        for i in range(len(self.clipseq)):
            self.clipseq[i].__setattr__(k, v)

    def __len__(
        self,
    ) -> int:
        return len(self.clipseq)

    @property
    def duration(
        self,
    ) -> float:
        """Sum of Clip.duration over the sequence.

        Returns:
            float: total duration of the sequence.
        """
        if len(self.clipseq) == 0:
            return 0
        else:
            return sum([c.duration for c in self.clipseq])

    def __getitem__(self, i: Union[int, Iterable]) -> Union[Clip, ClipSeq]:
        """Index and slice access: an int yields a Clip; an iterable or a
        slice yields a new ClipSeq.

        Args:
            i (int, Iterable or slice): index

        Raises:
            ValueError: for unsupported index types

        Returns:
            Clip or ClipSeq:
        """
        # Coerce numpy integer scalars ("int64" etc.) to a plain int first.
        if "int" in str(type(i)):
            i = int(i)
        if isinstance(i, int):
            clip = self.clipseq[i]
            return clip
        elif isinstance(i, Iterable):
            clipseq = [self.__getitem__(x) for x in i]
            clipseq = ClipSeq(clipseq)
            return clipseq
        elif isinstance(i, slice):
            # NOTE(review): a slice with start/stop of None would break
            # range(); only fully-specified slices are supported here.
            if i.step is None:
                step = 1
            else:
                step = i.step
            clipseq = [self.__getitem__(x) for x in range(i.start, i.stop, step)]
            clipseq = ClipSeq(clipseq)
            return clipseq
        else:
            raise ValueError(
                "unsupported input, should be int or slice, but given {}, type={}".format(
                    i, type(i)
                )
            )

    @property
    def mvp_clip(self):
        """Load the actual clip data in moviepy format.

        Raises:
            NotImplementedError: must be provided by subclasses.
        """
        raise NotImplementedError

    @property
    def duration_seq_emb(
        self,
    ) -> np.array:
        # 1-D array of per-clip durations.
        emb = np.array([c.duration for c in self.clipseq])
        return emb

    @property
    def timestamp_seq_emb(self) -> np.array:
        # 1-D array of per-clip start times.
        emb = np.array([c.time_start for c in self.clipseq])
        return emb

    @property
    def rela_timestamp_seq_emb(self) -> np.array:
        # Cumulative durations normalized by total duration: the relative
        # end-time of each clip in [0, 1].
        duration_seq = [c.duration for c in self.clipseq]
        emb = np.cumsum(duration_seq) / self.duration
        return emb

    def get_emb(self, key: str, idx: Union[int, Iterable, None]) -> float:
        """Look up embedding values for ``key`` relative to this sub-sequence.

        Offsets ``idx`` by the first clip's clipid so callers index the
        sub-sequence from 0; ``idx=None`` selects the whole range.

        NOTE(review): relies on ``self.emb`` being attached externally — it
        is not set in __init__; confirm against the owning media map.
        """
        clip_start_idx = self.clipseq[0].clipid
        clip_end_idx = self.clipseq[-1].clipid
        # TODO: generalize this indexing scheme.
        if idx is None:
            idx = range(clip_start_idx, clip_end_idx + 1)
        elif isinstance(idx, int):
            idx += clip_start_idx
        elif isinstance(idx, Iterable):
            idx = [x + clip_start_idx for x in idx]
        else:
            raise ValueError(
                f"idx only support None, int, Iterable, but given {idx},type is {type(idx)}"
            )
        return self.emb.get_value(key, idx=idx)
MuseV/MMCM/mmcm/data/clip/clip_filter.py ADDED
@@ -0,0 +1,46 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ from typing import Callable, List, Union
2
+
3
+ from .clip import ClipSeq
4
+
5
+ from .clip_process import reset_clipseq_id
6
+
7
+
8
class ClipFilter(object):
    """Clip filter: decides whether a Clip meets the given criteria by
    combining several predicate functions with a logical reducer
    (``all`` for AND, ``any`` for OR).
    """

    def __init__(self, funcs: Union[Callable, List[Callable]], logic_func: Callable=all) -> None:
        """
        Args:
            funcs (callable or list of callables): predicates over a clip.
            logic_func (callable, optional): ``all`` or ``any``. Defaults to all.
        """
        if isinstance(funcs, list):
            self.funcs = funcs
        else:
            self.funcs = [funcs]
        self.logic_func = logic_func

    def __call__(self, clip) -> bool:
        results = [predicate(clip) for predicate in self.funcs]
        return self.logic_func(results)
29
+
30
+
31
+
32
# TODO
class ClipSeqFilter(object):
    """Applies a clip-level filter to every clip of a ClipSeq, keeping only
    the clips that pass."""

    def __init__(self, filter: Callable) -> None:
        # ``filter`` shadows the builtin; name kept for interface stability.
        self.filter = filter

    def __call__(self, clipseq: ClipSeq) -> ClipSeq:
        # NOTE(review): ``reset_clipseq_id`` receives a plain list here and
        # its result is returned directly — confirm it actually yields a
        # ClipSeq as the return annotation claims.
        new_clipseq = []
        n_clipseq = len(clipseq)
        for i in range(n_clipseq):
            clip = clipseq[i]
            if self.filter(clip):
                new_clipseq.append(clip)
        new_clipseq = reset_clipseq_id(new_clipseq)
        # logger.debug("ClipSeqFilter: clipseq length before={}, after={}".format(n_clipseq, len(new_clipseq)))
        return new_clipseq
MuseV/MMCM/mmcm/data/clip/clip_fusion.py ADDED
@@ -0,0 +1,64 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ from typing import List, Union, Callable
2
+
3
+ from copy import deepcopy
4
+
5
+ from .clip import ClipSeq
6
+ from .clip_process import reset_clipseq_id
7
+ import logging
8
+
9
+ logger = logging.getLogger(__name__) # pylint: disable=invalid-name
10
+
11
+
12
# TODO: different clip types need different fusion strategies
def fuse_clips(s1: ClipSeq, s2: ClipSeq) -> ClipSeq:
    """Fuse clip(s) ``s2`` into ``s1``.

    NOTE(review): despite the ClipSeq annotations, the body accesses Clip
    attributes (duration, stage, timepoint_type) — presumably takes a Clip
    and a Clip or list of Clips; confirm against callers.

    Args:
        s1 (Clip): fusion target; deep-copied, so the input is not mutated.
        s2 (Clip or [Clip]): clip(s) folded into s1.

    Returns:
        Clip: the fused clip.
    """
    if not isinstance(s2, list):
        s2 = [s2]
    s1 = deepcopy(s1)
    for other_clip in s2:
        # Durations accumulate; string-valued fields are joined with "_".
        s1.duration += other_clip.duration
        if s1.stage is not None and other_clip.stage is not None:
            # TODO: how to keep the information of the fused clips
            s1.stage = "{}_{}".format(s1.stage, other_clip.stage)
        # NOTE(review): the legacy Clip defines ``original_clipid``, not
        # ``origin_clipid`` — this may raise AttributeError; verify.
        s1.origin_clipid.extend(other_clip.origin_clipid)
        if s1.timepoint_type is not None and other_clip.timepoint_type is not None:
            s1.timepoint_type = "{}_{}".format(
                s1.timepoint_type, other_clip.timepoint_type
            )
    return s1
37
+
38
+
39
# TODO: different filter and fusion functions don't fit one shared flow; optimize later
class ClipSeqFusion(object):
    """Filters a ClipSeq; fusion is not implemented yet.

    NOTE(review): ``fuse_func`` is stored but never used, and the body —
    including the log message — is identical to ClipSeqFilter; presumably a
    work in progress.
    """

    def __init__(self, filter: Callable, fuse_func: Callable = None) -> None:
        self.filter = filter
        self.fuse_func = fuse_func

    def __call__(self, clipseq: ClipSeq) -> ClipSeq:
        new_clipseq = []
        n_clipseq = len(clipseq)
        for i in range(n_clipseq):
            clip = clipseq[i]
            if self.filter(clip):
                new_clipseq.append(clip)
        new_clipseq = reset_clipseq_id(new_clipseq)
        logger.debug(
            "ClipSeqFilter: clipseq length before={}, after={}".format(
                n_clipseq, len(new_clipseq)
            )
        )
        return new_clipseq
MuseV/MMCM/mmcm/data/clip/clip_process.py ADDED
@@ -0,0 +1,366 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ from functools import partial
2
+ from copy import deepcopy
3
+ from typing import Iterable, List, Tuple, Union
4
+ import bisect
5
+ import logging
6
+
7
+ import numpy as np
8
+
9
+
10
+ from .clip import Clip, ClipSeq
11
+ from .clipid import ClipIds, ClipIdsSeq, MatchedClipIds, MatchedClipIdsSeq
12
+
13
+ logger = logging.getLogger(__name__) # pylint: disable=invalid-name
14
+
15
+ __all__ = [
16
+ "find_idx_by_rela_time",
17
+ "find_idx_by_time",
18
+ "find_idx_by_clip",
19
+ "get_subseq_by_time",
20
+ "get_subseq_by_idx",
21
+ "clip_is_top",
22
+ "clip_is_middle",
23
+ "clip_is_end",
24
+ "abadon_old_return_new",
25
+ "reset_clipseq_id",
26
+ "insert_endclip",
27
+ "insert_startclip",
28
+ "drop_start_end_by_time",
29
+ "complete_clipseq",
30
+ "complete_gap",
31
+ "get_subseq_by_stages",
32
+ "find_time_by_stage",
33
+ ]
34
+
35
+
36
def find_idx_by_rela_time(clipseq: ClipSeq, timepoint: float) -> int:
    """Map a relative position (fraction of total duration) to a clip index.

    Builds the cumulative clip start times and bisects the absolute time
    into them; the result is clamped to a valid index.
    """
    absolute_time = clipseq.duration * timepoint
    starts = np.cumsum([0] + [clip.duration for clip in clipseq])
    pos = bisect.bisect_right(starts, absolute_time) - 1
    return min(max(pos, 0), len(clipseq) - 1)
45
+
46
+
47
def find_idx_by_time(clipseq: ClipSeq, timepoint: float) -> int:
    """Return the index of the clip whose span contains ``timepoint``.

    Bisects the clip start times; the result is clamped into
    ``[0, len(clipseq) - 1]`` so out-of-range timepoints map to the first
    or last clip.

    Args:
        clipseq (ClipSeq): sequence to search.
        timepoint (float): absolute time position.

    Returns:
        index of the containing clip.
    """
    starts = [clip.time_start for clip in clipseq]
    pos = bisect.bisect_right(starts, timepoint) - 1
    return min(max(pos, 0), len(clipseq) - 1)
61
+
62
+
63
def find_idx_by_clip(clipseq: ClipSeq, clip: Clip, eps: float = 1e-4) -> int:
    """Locate ``clip`` in ``clipseq`` by maximum temporal overlap.

    For every candidate the intersection with ``clip`` is computed and
    normalized by ``clip``'s duration; the candidate with the largest ratio
    wins, provided the ratio exceeds ``eps``.

    Args:
        clipseq (ClipSeq): candidate clips.
        clip (Clip): target clip.
        eps (float, optional): minimum overlap ratio. Defaults to 1e-4.

    Returns:
        int: index of the best-overlapping candidate, or None when nothing
        overlaps enough.
    """
    spans = np.array(
        [[c.time_start, c.time_start + c.duration] for c in clipseq]
    )
    target_start = clip.time_start
    target_end = target_start + clip.duration
    overlap = np.minimum(spans[:, 1], target_end) - np.maximum(
        spans[:, 0], target_start
    )
    ratio = overlap / clip.duration
    if np.max(ratio) <= eps:
        return None
    return np.argmax(ratio)
85
+
86
+
87
def get_subseq_by_time(
    clipseq: ClipSeq,
    start: float = 0,
    duration: float = None,
    end: float = 1,
    eps: float = 1e-2,
) -> ClipSeq:
    """Trim a clip sequence, keeping the clips between ``start`` and ``end``.

    ``start``/``end`` below 1 are treated as relative positions and are
    multiplied by ``duration``; values >= 1 are absolute times in seconds.

    Args:
        clipseq (ClipSeq): sequence to trim.
        start (float): beginning of the kept range. Defaults to 0 (no head trim).
        duration (float, optional): total duration; read from
            ``clipseq.duration`` when None.
        end (float, optional): end of the kept range. Defaults to 1 (no tail trim).
        eps (float, optional): tolerance when deciding that ``end`` equals
            the full duration.

    Returns:
        ClipSeq: the trimmed sequence.
    """
    if (start == 0 or start is None) and (end is None or end == 1):
        logger.warning("you should set start or end")
        return clipseq
    if duration is None:
        duration = clipseq.duration
    if start is None or start == 0:
        clip_start_idx = 0
    else:
        if start < 1:
            start = start * duration
        clip_start_idx = find_idx_by_time(clipseq, start)
    if end is None or end == 1 or np.abs(duration - end) < eps:
        # fixed: an end index of -1 made the final slice clipseq[start:-1],
        # silently dropping the last clip; None keeps the sequence to the end
        clip_end_idx = None
    else:
        if end < 1:
            end = end * duration
        clip_end_idx = find_idx_by_time(clipseq, end)
    if clip_end_idx is not None and clip_start_idx >= clip_end_idx:
        logger.error(
            f"clip_end_idx({clip_end_idx}) should be > clip_start_idx({clip_start_idx})"
        )
    subseq = get_subseq_by_idx(clipseq, clip_start_idx, clip_end_idx)
    return subseq
130
+
131
+
132
def get_subseq_by_idx(clipseq: ClipSeq, start: int = None, end: int = None) -> ClipSeq:
    """Slice a clip sequence by index range.

    Args:
        clipseq (ClipSeq): sequence to slice.
        start (int, optional): first index to keep; None means 0.
        end (int, optional): one-past-last index; None means the end.

    Returns:
        the untouched sequence when both bounds are None, otherwise the slice.
    """
    if start is None and end is None:
        return clipseq
    lo = 0 if start is None else start
    hi = len(clipseq) if end is None else end
    return clipseq[lo:hi]
150
+
151
+
152
def clip_is_top(clip: Clip, total: float, th: float = 0.1) -> bool:
    """Tell whether ``clip`` starts within the head portion of the sequence.

    Args:
        clip (Clip): clip under test.
        total (float): total duration of the owning ClipSeq.
        th (float, optional): fraction of ``total`` regarded as the head.
            Defaults to 0.1.

    Returns:
        bool: True when the clip's start time falls inside the head portion.
    """
    return clip.time_start / total <= th
168
+
169
+
170
def clip_is_end(clip: Clip, total: float, th: float = 0.9) -> bool:
    """Tell whether ``clip`` ends within the tail portion of the sequence.

    Args:
        clip (Clip): clip under test.
        total (float): total duration of the owning ClipSeq.
        th (float, optional): fraction of ``total`` where the tail begins.
            Defaults to 0.9.

    Returns:
        bool: True when the clip's end time falls inside the tail portion.
    """
    end_time = clip.time_start + clip.duration
    return end_time / total >= th
186
+
187
+
188
def clip_is_middle(
    clip: Clip, total: float, start: float = 0.05, end: float = 0.9
) -> bool:
    """Tell whether ``clip`` lies completely inside the middle portion.

    ``start``/``end`` below 1 are fractions of ``total``; larger values are
    absolute times.

    Args:
        clip (Clip): clip under test.
        total (float): total duration of the owning ClipSeq.
        start (float, optional): beginning of the middle portion. Defaults to 0.05.
        end (float, optional): end of the middle portion. Defaults to 0.9.

    Returns:
        bool: True when the whole clip fits between ``start`` and ``end``.
    """
    lo = total * start if 0 <= start < 1 else start
    hi = total * end if 0 < end <= 1 else end
    clip_begin = clip.time_start
    clip_finish = clip.time_start + clip.duration
    return clip_begin >= lo and clip_finish <= hi
212
+
213
+
214
def abadon_old_return_new(s1: Clip, s2: Clip) -> Clip:
    """Degenerate fusion strategy: drop the earlier clip, keep the later one.

    Args:
        s1 (Clip): earlier clip (discarded).
        s2 (Clip): later clip.

    Returns:
        Clip: ``s2`` unchanged.
    """
    return s2
225
+
226
+
227
+ # TODO:待确认是否要更新clipid,不方便对比着json进行debug
228
def reset_clipseq_id(clipseq: ClipSeq) -> ClipSeq:
    """Renumber every clip in place so ``clipid`` equals its position.

    Handles both dict-style clips and Clip objects; returns the same
    sequence for call-chaining.
    """
    for idx, clip in enumerate(clipseq):
        if isinstance(clip, dict):
            clip["clipid"] = idx
        else:
            clip.clipid = idx
    return clipseq
235
+
236
+
237
def insert_startclip(clipseq: ClipSeq) -> ClipSeq:
    """Prepend a leading clip so the sequence starts at t=0.

    Only acts when the first clip starts after 0: a gap-filling clip of the
    sequence's own ``ClipClass`` is inserted covering ``[0, first.time_start)``
    with ``timepoint_type=0``, then all clip ids are renumbered.

    Args:
        clipseq (ClipSeq): sequence to complete.

    Returns:
        ClipSeq: the sequence, now starting at time 0.
    """
    if clipseq[0].time_start > 0:
        start = clipseq.ClipClass(
            time_start=0, duration=round(clipseq[0].time_start, 3), timepoint_type=0
        )
        clipseq.insert(0, start)
        clipseq = reset_clipseq_id(clipseq)
    return clipseq
254
+
255
+
256
def insert_endclip(clipseq: ClipSeq, duration: float) -> ClipSeq:
    """Append a trailing clip so the sequence covers the full duration.

    Only acts when more than 1 second is missing at the tail: a gap-filling
    clip of the sequence's own ``ClipClass`` with ``timepoint_type=0`` is
    appended and clip ids are renumbered.

    Args:
        clipseq (ClipSeq): sequence to complete.
        duration (float): total duration of the media.

    Returns:
        ClipSeq: the sequence, extended to ``duration`` when needed.
    """
    clipseq_endtime = clipseq[-1].time_start + clipseq[-1].duration
    # NOTE(review): the 1-second threshold is hard-coded; tail gaps shorter
    # than 1s stay uncovered — confirm this is intended
    if duration - clipseq_endtime > 1:
        end = clipseq.ClipClass(
            time_start=round(clipseq_endtime, 3),
            duration=round(duration - clipseq_endtime, 3),
            timepoint_type=0,
        )
        clipseq.append(end)
        clipseq = reset_clipseq_id(clipseq)
    return clipseq
277
+
278
+
279
def drop_start_end_by_time(
    clipseq: ClipSeq, start: float, end: float, duration: float = None
):
    """Alias of :func:`get_subseq_by_time`: trim head and tail by time."""
    return get_subseq_by_time(clipseq=clipseq, start=start, end=end, duration=duration)
283
+
284
+
285
def complete_clipseq(
    clipseq: ClipSeq, duration: float = None, gap_th: float = 2
) -> ClipSeq:
    """Make the timeline of a clip sequence continuous and complete.

    Lyric-derived music maps, for example, may miss the head, the tail and
    inter-line gaps; this fills all of them with blank clips.

    Args:
        clipseq (ClipSeq): sequence to complete; a plain list is first
            wrapped into a ClipSeq.
        duration (float, optional): total duration; the tail is only
            completed when given. Defaults to None.
        gap_th (float, optional): gaps shorter than this are merged into the
            preceding clip instead of becoming blank clips. Defaults to 2.

    Returns:
        ClipSeq: sequence with a continuous, complete timeline.
    """
    if isinstance(clipseq, list):
        # recurse once with a proper ClipSeq wrapper
        clipseq = ClipSeq(clipseq)
        return complete_clipseq(clipseq=clipseq, duration=duration, gap_th=gap_th)
    clipseq = complete_gap(clipseq, th=gap_th)
    clipseq = insert_startclip(clipseq)
    if duration is not None:
        clipseq = insert_endclip(clipseq, duration)
    return clipseq
307
+
308
+
309
def complete_gap(clipseq: ClipSeq, th: float = 2) -> ClipSeq:
    """Fill the gaps between consecutive clips with blank clips.

    A gap of at least ``th`` seconds becomes a new blank clip
    (``timepoint_type=0``); a shorter gap is absorbed by extending the
    preceding clip. The sequence is then re-sorted by start time and clip
    ids are renumbered.

    Args:
        clipseq (ClipSeq): original sequence (e.g. built from lyrics).
        th (float, optional): minimum gap length that becomes its own blank
            clip. Defaults to 2.

    Returns:
        ClipSeq: the completed sequence (modified in place and returned).
    """
    gap_clipseq = []
    clipid = 0
    for i in range(len(clipseq) - 1):
        time_start = clipseq[i].time_start
        duration = clipseq[i].duration
        time_end = time_start + duration
        next_time_start = clipseq[i + 1].time_start
        time_diff = next_time_start - time_end
        if time_diff >= th:
            # temporary clipid; reset_clipseq_id below assigns the final ids
            blank_clip = clipseq.ClipClass(
                time_start=time_end,
                duration=time_diff,
                timepoint_type=0,
                clipid=clipid,
            )
            gap_clipseq.append(blank_clip)
            clipid += 1
        else:
            # absorb the short gap by stretching the current clip
            clipseq[i].duration = next_time_start - time_start
    clipseq.extend(gap_clipseq)
    clipseq.clips = sorted(clipseq.clips, key=lambda clip: clip.time_start)
    reset_clipseq_id(clipseq)
    return clipseq
343
+
344
+
345
def find_time_by_stage(
    clipseq: ClipSeq, stages: Union[str, List[str]] = None
) -> Tuple[float, float]:
    """Return the time span of the first clip whose stage is in ``stages``.

    Args:
        clipseq: clips to search.
        stages: one stage name or a list of names.

    Returns:
        ``(time_start, time_end)`` of the first matching clip, or
        ``(None, None)`` when no clip matches.
    """
    # fixed: a single stage string must be wrapped into a list; the original
    # wrapped lists instead, so list input always missed and str input did
    # substring matching
    if isinstance(stages, str):
        stages = [stages]
    for clip in clipseq:
        if clip.stage in stages:
            return clip.time_start, clip.time_end
    return None, None
354
+
355
+
356
def get_subseq_by_stages(clipseq: ClipSeq, stages: Union[str, List[str]]) -> ClipSeq:
    """Cut the subsequence spanning from the first to the last given stage.

    Args:
        clipseq: sequence to cut.
        stages: stage name(s); the span runs from the start of the first
            listed stage to the end of the last one. Missing stages fall
            back to the sequence boundaries.

    Returns:
        ClipSeq: the clips covering the requested stages.
    """
    # fixed: the original wrapped lists instead of strings and referenced
    # undefined names start1/end2, raising NameError on every call
    if isinstance(stages, str):
        stages = [stages]
    start, _ = find_time_by_stage(clipseq, stages[0])
    _, end = find_time_by_stage(clipseq, stages[-1])
    if start is None:
        start = 0
    if end is None:
        end = clipseq.duration
    return get_subseq_by_time(clipseq=clipseq, start=start, end=end)
MuseV/MMCM/mmcm/data/clip/clip_stat.py ADDED
@@ -0,0 +1,13 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ from typing import Tuple
2
+
3
+ import numpy as np
4
+
5
+ from .clip import ClipSeq
6
+
7
+
8
def stat_clipseq_duration(
    clipseq: ClipSeq,
) -> Tuple[np.array, np.array]:
    """Histogram the per-clip durations of a sequence.

    Returns:
        (hist, bin_edges) as produced by ``np.histogram`` over the clip
        durations.
    """
    durations = [c.duration for c in clipseq]
    hist, bin_edges = np.histogram(durations)
    return hist, bin_edges
MuseV/MMCM/mmcm/data/clip/clipid.py ADDED
@@ -0,0 +1,70 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ from __future__ import annotations
2
+
3
+ from typing import Union, List
4
+
5
+ __all__ = [
6
+ "ClipIds",
7
+ "ClipIdsSeq",
8
+ "MatchedClipIds",
9
+ "MatchedClipIdsSeq",
10
+ ]
11
+
12
+
13
class ClipIds(object):
    """Indices of clips inside a ClipSeq.

    Mainly used for clips produced by fusing several source clips, e.g. one
    MusicClip matched against several VideoClips.
    """

    def __init__(
        self,
        clipids: Union[int, List[int]],
    ) -> None:
        """
        Args:
            clipids (list or int): position(s) in the ClipSeq; a single int
                is wrapped into a one-element list.
        """
        if isinstance(clipids, list):
            self.clipids = clipids
        else:
            self.clipids = [clipids]
25
+
26
+
27
class ClipIdsSeq(object):
    """A list of ClipIds, e.g. a ClipSeq regrouped into coarser units."""

    def __init__(self, clipids_seq: List[ClipIds]) -> None:
        """
        Args:
            clipids_seq (list): list of ClipIds; a single ClipIds is
                wrapped into a one-element list.
        """
        # fixed: the original tested isinstance(..., ClipIds), which wrapped
        # the documented list input into a nested list and left a bare
        # ClipIds unwrapped
        self.clipids_seq = (
            clipids_seq if isinstance(clipids_seq, list) else [clipids_seq]
        )
38
+
39
+
40
+ # TODO: metric后续可能是字典
41
class MatchedClipIds(object):
    """A matched pair of clip-id groups from two modalities.

    Typical use: the correspondence between a music clip and video clips.
    """

    def __init__(
        self, id1: ClipIds, id2: ClipIds, metric: float = None, **kwargs
    ) -> None:
        """
        Args:
            id1 (ClipIds): clip ids of the first modality; coerced to ClipIds.
            id2 (ClipIds): clip ids of the second modality; coerced to ClipIds.
            metric (float): matching distance/score.
            **kwargs: extra attributes stored directly on the instance.
        """
        self.id1 = id1 if isinstance(id1, ClipIds) else ClipIds(id1)
        self.id2 = id2 if isinstance(id2, ClipIds) else ClipIds(id2)
        self.metric = metric
        self.__dict__.update(**kwargs)
57
+
58
+
59
class MatchedClipIdsSeq(object):
    """A sequence of matched clip-id pairs between two modalities.

    Typical use: the full alignment between a music clip sequence and a
    video clip sequence, where every element is a MatchedClipIds.
    """

    def __init__(self, seq: List[MatchedClipIds], metric: float = None, **kwargs) -> None:
        """
        Args:
            seq (list): list of matched pairs.
            metric (float): overall matching distance/score.
            **kwargs: extra attributes stored directly on the instance.
        """
        self.seq = seq
        self.metric = metric
        for name, value in kwargs.items():
            setattr(self, name, value)
MuseV/MMCM/mmcm/data/crawl/__init__.py ADDED
File without changes
MuseV/MMCM/mmcm/data/crawl/download.py ADDED
@@ -0,0 +1,72 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+
2
+ from collections import namedtuple
3
+ from typing import NamedTuple, Tuple, List
4
+ import logging
5
+ import os
6
+ import numpy as np
7
+ import subprocess
8
+
9
+ import requests
10
+
11
+ import wget
12
+
13
+ from .youtube import download_youtube
14
+ from .flicker import download_flickr
15
+ from .ffmpeg import ffmpeg_load
16
+
17
+ logger = logging.getLogger(__name__)
18
+
19
+ # DownloadStatus = namedtuple("DownloadStatus", ["status_code", "msg"])
20
+
21
# Status codes shared by the download helpers below: 0 is success, negative
# values are failures or skips.
status_code = {0: "download: succ",
               -1: "download: failed",
               -2: "clip: failed",
               -3: "directory not exists",
               -4: "skip task",
               - 404: "param error"}
27
+
28
+
29
def download_with_request(url, path):
    """Download ``url`` to ``path`` using requests.

    Args:
        url (str): source url.
        path (str): destination file path.

    Returns:
        str: ``path`` on success.

    Raises:
        requests.HTTPError: on a non-2xx response, so callers (e.g.
            ``download_video``) report the failure instead of treating a
            never-written file as success — the original printed a message
            and still returned ``path``.
    """
    res = requests.get(url)
    res.raise_for_status()
    with open(path, "wb") as f:
        f.write(res.content)
    return path
37
+
38
def download_video(url, save_path:str=None, save_dir:str=None, basename:str=None, filename:str=None, format:str=None, data_type: str="wget", **kwargs) -> Tuple[int, str]:
    """Download a video with the backend selected by ``data_type``.

    Missing path pieces are derived from the other arguments: ``save_path``
    from ``save_dir``+``basename`` and vice versa.

    Args:
        url: source url.
        save_path (str, optional): full target path.
        save_dir (str, optional): target directory.
        basename (str, optional): file name with extension.
        filename (str, optional): file name without extension.
        format (str, optional): extension used when ``basename`` must be
            synthesized.
        data_type (str): one of ``requests``, ``wget``, ``youtube``,
            ``flickr``, ``ffmpeg``.

    Returns:
        Tuple[int, str]: (status code as in ``status_code``, saved path);
        (-4, path) when the target already exists, (-1, None) on any failure.
    """
    # derive the missing path components from whichever ones were given
    if save_path is None:
        if basename is None:
            basename = f"(unknown).{format}"
        save_path = os.path.join(save_dir, basename)
    if save_dir is None:
        save_dir = os.path.dirname(save_path)
    if basename is None:
        basename = os.path.basename(save_path)
    if filename is None:
        filename, format = os.path.splitext(basename)
    os.makedirs(save_dir, exist_ok=True)

    # skip tasks whose target already exists
    if os.path.exists(save_path):
        return (-4, save_path)

    try:
        if data_type == "requests":
            save_path = download_with_request(url=url, path=save_path)
        elif data_type == "wget":
            save_path = wget.download(url=url, out=save_path)
        elif data_type == "youtube":
            save_path = download_youtube(url, format=format, save_dir=save_dir, filename=basename)
        elif data_type == "flickr":
            save_path = download_flickr(url, save_path)
        elif data_type == "ffmpeg":
            # NOTE(review): only the return code is kept here; save_path is
            # assumed to be written by ffmpeg_load itself — confirm
            code = ffmpeg_load(url=url, save_path=save_path)
        else:
            raise ValueError(f"data_type shoulbe one of [wget, youtube, flickr, ffmpeg], but given {data_type}")
    except Exception as e:
        logger.error("failed download file {} to {} failed!".format(url, save_path))
        logger.exception(e)
        return (-1, None)

    return (0, save_path)
MuseV/MMCM/mmcm/data/crawl/error.py ADDED
@@ -0,0 +1,20 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+
2
+
3
class SubprocessError(Exception):
    """
    Raised when a command executed through ``subprocess`` exits with a
    non-zero return code; carries the command and its captured output.
    """

    def __init__(self, cmd, return_code, stdout, stderr, *args):
        # prefer stderr for the message, fall back to stdout when it is blank
        details = stderr if stderr.strip() else stdout
        msg = 'Got non-zero exit code ({1}) from command "{0}": {2}'.format(
            cmd[0], return_code, details
        )
        self.cmd = cmd
        self.cmd_return_code = return_code
        self.cmd_stdout = stdout
        self.cmd_stderr = stderr
        super(SubprocessError, self).__init__(msg, *args)
MuseV/MMCM/mmcm/data/crawl/ffmpeg.py ADDED
@@ -0,0 +1,39 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ import subprocess
2
+
3
+ from .error import SubprocessError
4
+
5
+
6
class FfmpegInvalidURLError(Exception):
    """
    Raised when a request made for ffmpeg fails with a 4XX or 5XX error.
    """

    def __init__(self, url, error, *args):
        self.url = url
        self.error = error
        message = 'Got error when making request to "{}": {}'.format(url, error)
        super(FfmpegInvalidURLError, self).__init__(message, *args)
16
+
17
+
18
def ffmpeg_load(url: str, save_path: str) -> int:
    """Fetch ``url`` with ffmpeg and write the first 10 seconds to ``save_path``.

    The clip is re-encoded to 30fps h264 mp4; ``-n`` makes ffmpeg refuse to
    overwrite an existing output file.

    Returns:
        int: ffmpeg's return code (0 on success); the original ``-> str``
        annotation was wrong.

    Raises:
        SubprocessError: when ffmpeg exits with a non-zero code.
    """

    def run(cmd):
        # capture both streams so they can be attached to the error
        proc = subprocess.Popen(
            cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
        stdout, stderr = proc.communicate()
        return_code = proc.returncode

        if return_code != 0:
            raise SubprocessError(
                cmd, return_code, stdout.decode(), stderr.decode())
        return return_code

    command = ['ffmpeg', '-n', '-i', url, '-t', '10', '-f', 'mp4',
               '-r', '30', '-vcodec', 'h264', save_path, '-loglevel', 'error']
    code = run(command)
    return code
35
+
36
+
37
+
38
+
39
+
MuseV/MMCM/mmcm/data/crawl/flicker.py ADDED
@@ -0,0 +1,22 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ import os
2
+
3
+ from .ffmpeg import ffmpeg_load
4
+
5
+
6
def extract_flickr_id(url):
    """Pull the flickr video id out of a flickr url.

    Assumes the id is the 4th-from-last path segment after trailing slashes
    are removed.
    """
    segments = url.strip('/').split('/')
    return segments[-4]
8
+
9
+
10
def download_flickr(url: str, save_path: str) -> str:
    """Download a flickr video, retrying via the direct download endpoint.

    The first attempt feeds the given url straight to ffmpeg; when that
    fails (``ffmpeg_load`` raises on non-zero exit, so the original retry
    branch was unreachable), the flickr id is extracted and the official
    ``video_download.gne`` endpoint is tried instead.

    Args:
        url: flickr page/video url.
        save_path: destination path.

    Returns:
        str: ``save_path``. (The original returned a ``(code, path)`` tuple
        on first-try success, breaking callers that expect a path.)
    """
    try:
        code = ffmpeg_load(url=url, save_path=save_path)
        if code == 0:
            return save_path
    except Exception:
        # ffmpeg_load raises SubprocessError on failure; swallow it so the
        # retry below gets its chance
        pass
    # only retry when the first attempt failed
    flickr_id = extract_flickr_id(url)
    retry_url = 'https://www.flickr.com/video_download.gne?id={}'.format(flickr_id)
    ffmpeg_load(url=retry_url, save_path=save_path)
    return save_path
MuseV/MMCM/mmcm/data/crawl/youtube.py ADDED
@@ -0,0 +1,13 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+
2
+ import os
3
+
4
+ from pytube import YouTube
5
+
6
+
7
def download_youtube(url, format, save_dir, filename):
    """Download the highest-resolution progressive stream of a youtube video.

    Args:
        url (str): youtube video url.
        format (str): file extension used to filter streams (e.g. "mp4").
        save_dir (str): output directory.
        filename (str): output file name.

    Returns:
        str: path of the downloaded file as reported by pytube.
    """
    youtube = YouTube(url)
    # progressive streams bundle audio and video in a single file
    streams = youtube.streams.filter(progressive=True,
                                     file_extension=format)
    save_path = streams.get_highest_resolution().download(output_path=save_dir,
                                                          filename=filename)
    return save_path
MuseV/MMCM/mmcm/data/emb/__init__.py ADDED
@@ -0,0 +1,2 @@
 
 
 
1
+ from .emb import *
2
+ from .h5py_emb import H5pyMediaMapEmb, H5pyMediaMapEmbProxy
MuseV/MMCM/mmcm/data/emb/emb.py ADDED
@@ -0,0 +1,104 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ """用于将 mediamap中的emb存储独立出去,仍处于开发中
2
+ """
3
+ import logging
4
+
5
+ import numpy as np
6
+
7
+
8
+ logger = logging.getLogger(__name__) # pylint: disable=invalid-name
9
+
10
+ __all__ = ["MediaMapEmb"]
11
+
12
+
13
class MediaMapEmb(object):
    """Key/value interface for media-map embeddings stored offline.

    Keys follow the layout ``"{level}_{factor}_{algo}"`` (level omitted for
    whole-media embeddings), e.g.::

        "overall_algo"                              # whole-file embedding
        "theme" / "emotion_algo" / "semantic_algo"  # per-dimension embeddings
        "clips_overall_algo"                        # n_clips x clip_emb
        "scenes_semantic_algo"                      # n_scenes x scene_emb
        "frames_theme_algo"                         # n_frames x frame_emb
        "frames_objs/{frame_id}/{factor}_{algo}"    # n_objs x obj_emb
        "roles_{algo}"                              # n x obj_emb

    Subclasses implement ``get_value``/``set_value`` against a concrete
    backend (e.g. hdf5).

    Args:
        path (str): backend storage path (e.g. an hdf5 file).
    """

    def __init__(self, path: str) -> None:
        self.path = path

    def get_value(self, key, idx=None):
        """Read ``key`` (optionally row ``idx``); backend-specific."""
        raise NotImplementedError

    def __getitem__(self, key):
        return self.get_value(key)

    def get_media(self, factor, algo):
        return self.get_value(f"{factor}_{algo}")

    def get_clips(self, factor, algo, idx=None):
        return self.get_value(f"clips_{factor}_{algo}", idx=idx)

    def get_frames(self, factor, algo, idx=None):
        return self.get_value(f"frames_{factor}_{algo}", idx=idx)

    def get_frame_objs(self, frame_idx, factor, algo, idx=None):
        return self.get_value(["frames_objs", frame_idx, f"{factor}_{algo}"], idx=idx)

    def set_value(self, key, value, idx=None):
        """Write ``value`` under ``key`` (optionally at row ``idx``); backend-specific."""
        raise NotImplementedError

    def set_media(self, factor, value, algo):
        self.set_value([f"{factor}_{algo}"], value)

    def set_clips(self, factor, value, algo, idx=None):
        self.set_value([f"clips_{factor}_{algo}"], value, idx=idx)

    def set_frames(self, factor, value, algo, idx=None):
        # fixed: idx was silently dropped, so row-wise frame updates
        # overwrote the whole dataset
        self.set_value([f"frames_{factor}_{algo}"], value, idx=idx)

    def set_frame_objs(self, frame_idx, factor, value, algo, idx=None):
        return self.set_value(
            ["frames_objs", frame_idx, f"{factor}_{algo}"], value, idx=idx
        )

    def set_roles(self, algo, value, idx=None):
        return self.set_value(f"roles_{algo}", value, idx=idx)

    def get_roles(self, algo, idx=None):
        return self.get_value(f"roles_{algo}", idx=idx)

    def __setitem__(self, key, value):
        # fixed: was ``self.set_value(self, key, value)``, which passed the
        # instance itself as the key
        self.set_value(key, value)
101
+
102
+
103
class MediaMapEmbProxy(MediaMapEmb):
    # placeholder for a proxy (e.g. remote/cached) embedding store
    pass
MuseV/MMCM/mmcm/data/emb/h5py_emb.py ADDED
@@ -0,0 +1,119 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ from typing import Union, List
2
+ import logging
3
+
4
+ import h5py
5
+ import numpy as np
6
+
7
+ from .emb import MediaMapEmb
8
+
9
+ logger = logging.getLogger(__name__) # pylint: disable=invalid-name
10
+
11
+ __all__ = ["H5pyMediaMapEmb", "save_value_with_h5py"]
12
+
13
+
14
def save_value_with_h5py(
    path: str,
    value: Union[np.ndarray, None],
    key: str,
    idx: Union[int, List[int]] = None,
    dtype=None,
    shape=None,
    overwrite: bool = False,
):
    """Write ``value`` into dataset ``key`` of the hdf5 file at ``path``.

    The dataset is (re)created when missing, when ``overwrite`` is set, or
    when a non-string dataset's shape differs from ``value``'s. With ``idx``
    given, only that row/slice is assigned; otherwise the whole dataset is
    overwritten.

    Args:
        path: hdf5 file path (opened in append mode).
        value: data to store; provide ``dtype``/``shape`` explicitly when None.
        key: dataset name.
        idx: optional row index/indices to assign.
        dtype: dataset dtype; defaults to ``value.dtype``.
        shape: dataset shape; defaults to ``value.shape``.
        overwrite: force re-creating an existing dataset.
    """
    with h5py.File(path, "a") as f:
        if dtype is None:
            dtype = value.dtype
        if shape is None:
            shape = value.shape
        del_key = False
        if key in f:
            if overwrite:
                del_key = True
            # NOTE(review): variable-length string datasets are never
            # re-created on shape mismatch — confirm this is intended
            if f[key].dtype != h5py.special_dtype(vlen=str):
                if f[key].shape != value.shape:
                    del_key = True
            if del_key:
                del f[key]
        if key not in f:
            f.create_dataset(key, shape=shape, dtype=dtype)
        if idx is None:
            f[key][...] = value
        else:
            f[key][idx] = value
43
+
44
+
45
class H5pyMediaMapEmb(MediaMapEmb):
    """hdf5-backed media-map embedding store.

    Keys follow the ``MediaMapEmb`` layout (``"clips_{factor}_{algo}"``,
    ``"frames_objs/{frame_id}/{factor}_{algo}"``, ``"roles_{algo}"`` ...);
    list keys are joined with ``/`` into hdf5 group paths.
    """

    def __init__(self, path: str) -> None:
        """
        Args:
            path (str): hdf5 file path; the file is opened in append mode
                and stays open for the lifetime of the object (see close()).
        """
        super().__init__(path)
        # TODO: support reading/writing through a ``with open`` style API
        self.f = h5py.File(path, "a")

    def _keys_index(self, key):
        # join a single key or a list of key parts into one hdf5 path,
        # skipping None parts
        if not isinstance(key, list):
            key = [key]
        key = "/".join([str(x) for x in key if x is not None])
        return key

    def get_value(self, key, idx=None):
        """Read the whole dataset (idx None) or one row as ``np.ndarray``."""
        new_key = self._keys_index(key)
        if idx is None:
            data = np.array(self.f[new_key])
        else:
            data = np.array(self.f[new_key][idx])
        return data

    def set_value(self, key, value, idx=None):
        """Create the dataset on first write, then assign all of it or row ``idx``."""
        new_key = self._keys_index(key)
        if new_key not in self.f:
            self.f.create_dataset(new_key, shape=value.shape, dtype=value.dtype)
        if idx is None:
            self.f[new_key][...] = value
        else:
            self.f[new_key][idx] = value

    def close(self):
        # release the underlying hdf5 file handle
        self.f.close()
116
+
117
+
118
class H5pyMediaMapEmbProxy(H5pyMediaMapEmb):
    # placeholder for a proxy (e.g. remote/cached) hdf5 store
    pass
MuseV/MMCM/mmcm/data/emb/json_emb.py ADDED
File without changes
MuseV/MMCM/mmcm/data/emb/numpy_emb.py ADDED
File without changes
MuseV/MMCM/mmcm/data/extract_feature/__init__.py ADDED
File without changes
MuseV/MMCM/mmcm/data/extract_feature/base_extract_feature.py ADDED
@@ -0,0 +1,28 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ from typing import List, Union, Any
2
+
3
+ import torch
4
+ from torch import nn
5
+ import numpy as np
6
+ import h5py
7
+
8
+
9
class BaseFeatureExtractor(nn.Module):
    """Base class for media feature extractors.

    Subclasses implement ``extract``; both ``__call__`` and ``forward``
    delegate to it.

    Args:
        device (str): torch device for the extractor. Defaults to "cpu".
        dtype: torch dtype of the produced features. Defaults to torch.float32.
        name (str): identifier used when persisting features.
    """

    def __init__(self, device: str = "cpu", dtype=torch.float32, name: str = None):
        super().__init__()
        self.device = device
        self.dtype = dtype
        self.name = name

    def extract(
        self, data: Any, return_type: str = "numpy"
    ) -> Union[np.ndarray, torch.Tensor]:
        """Extract features from ``data``; must be overridden.

        Raises:
            NotImplementedError: always, in the base class. (The original
            raised ``NotADirectoryError`` — clearly a typo.)
        """
        raise NotImplementedError

    def __call__(self, *args: Any, **kwds: Any) -> Any:
        return self.extract(*args, **kwds)

    def save_with_h5py(self, f: Union[h5py.File, str], *args, **kwds):
        """Persist extracted features into an hdf5 file or path; override me."""
        raise NotImplementedError

    def forward(self, *args: Any, **kwds: Any) -> Any:
        return self.extract(*args, **kwds)
MuseV/MMCM/mmcm/data/general/__init__.py ADDED
@@ -0,0 +1 @@
 
 
1
+ from .items import Items
MuseV/MMCM/mmcm/data/general/items.py ADDED
@@ -0,0 +1,69 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ from collections import UserList
2
+ from collections.abc import Iterable
3
+ from typing import Iterator, Any, List
4
+
5
+ from ...utils.util import convert_class_attr_to_dict
6
+
7
+ __all__ = ["Item", "Items"]
8
+
9
+
10
class Item(object):
    """Base class for map items that can serialize themselves to a dict."""

    def __init__(self) -> None:
        pass

    def to_dct(self, target_keys: List[str] = None, ignored_keys: List[str] = None):
        """Convert the item's attributes to a dict.

        Args:
            target_keys: keep only these attributes when given.
            ignored_keys: attribute name or list of names to drop, in
                addition to the always-ignored ``kwargs``.
        """
        skip = ["kwargs"]
        if isinstance(ignored_keys, list):
            skip.extend(ignored_keys)
        elif isinstance(ignored_keys, str):
            skip.append(ignored_keys)
        return convert_class_attr_to_dict(
            self, target_keys=target_keys, ignored_keys=skip
        )

    def preprocess(self):
        """Hook for subclasses; no-op by default."""
        pass
30
+
31
+
32
class Items(UserList):
    """A thin list wrapper for Item objects.

    A ``None`` payload becomes an empty list; a scalar payload is wrapped
    into a one-element list.
    """

    def __init__(
        self,
        data: Any = None,
    ):
        if data is None:
            items = []
        elif isinstance(data, list):
            items = data
        else:
            items = [data]
        super().__init__(items)

    def __len__(self):
        return len(self.data)

    def __getitem__(self, i):
        return self.data[i]

    def __delitem__(self, i):
        del self.data[i]

    def __setitem__(self, i, v):
        self.data[i] = v

    def insert(self, i, v):
        self.data.insert(i, v)

    def __str__(self):
        return str(self.data)

    def to_dct(self, target_keys: List[str] = None, ignored_keys: List[str] = None):
        """Serialize every contained item via its own ``to_dct``."""
        return [item.to_dct(target_keys, ignored_keys) for item in self.data]

    def __iter__(self) -> Iterator:
        return iter(self.data)

    def preprocess(self):
        """Hook for subclasses; no-op by default."""
        pass
MuseV/MMCM/mmcm/data/media_map/__init__.py ADDED
@@ -0,0 +1 @@
 
 
1
+ from .media_map import MetaInfo, MediaMap, MetaInfoList
MuseV/MMCM/mmcm/data/media_map/media_map.py ADDED
@@ -0,0 +1,393 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ from __future__ import annotations
2
+ import bisect
3
+
4
+ import logging
5
+ from copy import deepcopy
6
+
7
+ from functools import partial
8
+ from typing import Any, Callable, Iterable, List, Union, Tuple, Dict
9
+
10
+ import numpy as np
11
+ from ..clip.clip_process import get_subseq_by_time
12
+ from ..clip.clip_stat import stat_clipseq_duration
13
+ from ..clip import Clip, ClipSeq, ClipIds, MatchedClipIds, MatchedClipIdsSeq
14
+ from .media_map_process import get_sub_mediamap_by_time
15
+ from ..emb import MediaMapEmb, H5pyMediaMapEmb
16
+ from ..general.items import Item, Items
17
+ from ...utils.data_util import pick_subdct
18
+ from ...utils.util import convert_class_attr_to_dict, load_dct_from_file
19
+
20
+ logger = logging.getLogger(__name__) # pylint: disable=invalid-name
21
+
22
+
23
+ __all__ = ["MetaInfo", "MetaInfoList", "MediaMap", "MediaMapSeq"]
24
+
25
+
26
class MetaInfo(Item):
    """Media-file-level metadata for a song/video (id, name, duration, paths).

    ``start``/``end`` may be given either in seconds or as a fraction of
    ``media_duration`` in [0, 1]; ``preprocess`` normalizes them to seconds.
    """

    def __init__(
        self,
        mediaid=None,
        media_name=None,
        media_duration=None,
        signature=None,
        media_path: str = None,
        media_map_path: str = None,
        start: float = None,
        end: float = None,
        ext=None,
        **kwargs,
    ):
        # BUG FIX: the original called ``super(MetaInfo).__init__()``, which
        # only initializes the unbound super proxy itself and never runs
        # ``Item.__init__``; use the zero-argument form instead.
        super().__init__()
        self.mediaid = mediaid
        self.media_name = media_name
        self.media_duration = media_duration
        self.signature = signature
        self.media_path = media_path
        self.media_map_path = media_map_path
        self.start = start
        self.end = end
        self.ext = ext
        # Any extra keyword arguments become plain attributes.
        self.__dict__.update(**kwargs)
        self.preprocess()

    def preprocess(self):
        """Normalize ``start``/``end`` to absolute seconds."""
        self.set_start_end()

    def set_start_end(self):
        """Resolve ``start``/``end``: None -> full range; values in [0, 1]
        are interpreted as fractions of ``media_duration``."""
        if self.start is None:
            self.start = 0
        elif self.start >= 0 and self.start <= 1:
            self.start = self.start * self.media_duration

        if self.end is None:
            self.end = self.media_duration
        elif self.end >= 0 and self.end <= 1:
            self.end = self.end * self.media_duration
68
+
69
+
70
class MetaInfoList(Items):
    """List of ``MetaInfo`` items; keeps per-source metadata when several
    songs/videos are cut together into one media map."""

    def __init__(self, items: Union[MetaInfo, List[MetaInfo]] = None):
        """
        Args:
            items (MetaInfo or List[MetaInfo], optional): one item or a list
                of them. Defaults to None (empty list).
        """
        if items is None:
            items = []
        else:
            items = items if isinstance(items, list) else [items]
        super().__init__(items)
        # BUG FIX: ``UserList`` stores its payload in ``self.data``;
        # ``self.items`` does not exist and raised AttributeError.
        self.meta_info_list = self.data
        if len(self.data) > 1:
            # NOTE(review): ``reset`` is not defined anywhere visible in this
            # hierarchy — presumably supplied by subclasses; confirm.
            self.reset()

    def __len__(self):
        return len(self.meta_info_list)

    def __getitem__(self, i) -> MetaInfo:
        return self.meta_info_list[i]

    @property
    def groupnum(self) -> int:
        """Number of source media files represented in this list."""
        return len(self.meta_info_list)
96
+
97
+
98
class MediaMap(object):
    """Base class for a media map (music / visual / rhythm-game "chart").

    Holds file-level ``MetaInfo`` plus clip sequences at several
    granularities. The attribute classes differ per media type, so subclasses
    may swap them for their own.
    """

    def __init__(
        self,
        meta_info: MetaInfo = None,
        clipseq: ClipSeq = None,
        stageseq: ClipSeq = None,
        frameseq: ClipSeq = None,
        emb: H5pyMediaMapEmb = None,
        **kwargs,
    ):
        """Store media-related information.

        Args:
            meta_info (MetaInfo): file-level metadata; extra ``kwargs`` are
                merged into it.
            clipseq (ClipSeq): clips sorted by clipid.
            stageseq (ClipSeq): coarser segmentation than ``clipseq`` (e.g.
                scene segments vs. shot segments).
            frameseq (ClipSeq): finer segmentation than ``clipseq``.
            emb (H5pyMediaMapEmb): embedding storage backing ``get_emb``.
        """
        self.meta_info = meta_info
        self.clipseq = clipseq
        self.frameseq = frameseq
        self.stageseq = stageseq
        self.emb = emb
        self.meta_info.__dict__.update(**kwargs)
        self.preprocess()

    def preprocess(
        self,
    ):
        """Trim head/tail if a sub-range was requested, then preprocess all
        sub-structures and cache the clip id range."""
        # NOTE(review): ``self.meta_info.end == 1`` looks like it was meant to
        # be ``!= 1`` (trim only when an explicit sub-range is requested);
        # kept as-is to preserve existing behavior — confirm with callers.
        if (self.meta_info.start != 0 and self.meta_info.start is not None) or (
            self.meta_info.end is not None and self.meta_info.end == 1
        ):
            self.drop_head_and_tail()
        self.meta_info.preprocess()
        if self.clipseq is not None:
            self.clipseq.preprocess()
        if self.frameseq is not None:
            self.frameseq.preprocess()
        if self.stageseq is not None:
            self.stageseq.preprocess()
        self.clip_start_idx = self.clipseq[0].clipid
        self.clip_end_idx = self.clipseq[-1].clipid

    def drop_head_and_tail(self) -> MediaMap:
        """Keep only clips inside [meta_info.start, meta_info.end]."""
        self.clipseq = get_subseq_by_time(
            self.clipseq,
            start=self.meta_info.start,
            end=self.meta_info.end,
            duration=self.meta_info.media_duration,
        )
        if self.stageseq is not None:
            # BUG FIX: the original sliced ``self.clipseq`` here, silently
            # replacing the stage sequence with a copy of the clip sequence.
            self.stageseq = get_subseq_by_time(
                self.stageseq,
                start=self.meta_info.start,
                end=self.meta_info.end,
                duration=self.meta_info.media_duration,
            )

    def set_clip_value(self, k, v):
        """Assign ``k = v`` on every clip in ``clipseq``.

        Args:
            k (str): field name on Clip.
            v (any): field value.
        """
        self.clipseq.set_clip_value(k, v)

    def spread_metainfo_2_clip(
        self, target_keys: List = None, ignored_keys: List = None
    ) -> None:
        """Broadcast selected meta_info fields onto every clip so that later
        clip-level processing can access them.

        Args:
            target_keys ([str]): fields to copy onto clips.
            ignored_keys ([str]): fields to skip.
        """
        dst = pick_subdct(
            self.meta_info.__dict__, target_keys=target_keys, ignored_keys=ignored_keys
        )
        for k, v in dst.items():
            self.set_clip_value(k, v)

    def spread_parameters(self, target_keys: list, ignored_keys) -> None:
        """Broadcast metadata onto clips, then let each clip propagate its own
        parameters."""
        self.spread_metainfo_2_clip(target_keys=target_keys, ignored_keys=ignored_keys)
        for clip in self.clipseq:
            clip.spread_parameters()

    def stat(
        self,
    ):
        """Print summary statistics of the map (currently clip durations)."""
        self.stat_clipseq_duration()

    def stat_clipseq_duration(
        self,
    ):
        """Print a histogram of clip durations."""
        hist, bin_edges = stat_clipseq_duration(self.clipseq)
        print(self.media_name, "bin_edges", bin_edges)
        print(self.media_name, "hist", hist)

    def to_dct(self, target_keys: list = None, ignored_keys: list = None):
        """Serialize the map to a dict; must be provided by subclasses."""
        raise NotImplementedError

    @property
    def duration(
        self,
    ):
        """Total duration covered by ``clipseq``."""
        return self.clipseq.duration

    @property
    def mediaid(
        self,
    ):
        return self.meta_info.mediaid

    @property
    def media_name(
        self,
    ):
        return self.meta_info.media_name

    @property
    def duration_seq_emb(self):
        return self.clipseq.duration_seq_emb

    @property
    def timestamp_seq_emb(self):
        return self.clipseq.timestamp_seq_emb

    @property
    def rela_timestamp_seq_emb(self):
        return self.clipseq.rela_timestamp_seq_emb

    def get_emb(self, key, idx=None):
        """Fetch embedding(s) for ``key``; ``idx`` is offset by the clip id of
        the first clip (None means all clips).

        # TODO: generalize this indexing scheme.
        """
        if idx is None:
            idx = range(self.clip_start_idx, self.clip_end_idx + 1)
        elif isinstance(idx, int):
            idx += self.clip_start_idx
        elif isinstance(idx, Iterable):
            idx = [x + self.clip_start_idx for x in idx]
        else:
            raise ValueError(
                f"idx only support None, int, Iterable, but given {idx},type is {type(idx)}"
            )
        return self.emb.get_value(key, idx=idx)

    def get_meta_info_attr(self, key: str) -> Any:
        """Read an attribute from ``meta_info`` by name."""
        return getattr(self.meta_info, key)

    @classmethod
    def from_json_path(
        cls, path: Dict, emb_path: str, media_path: str = None, **kwargs
    ) -> MediaMap:
        """Build a map from a serialized map file plus an h5py embedding file."""
        media_map = load_dct_from_file(path)
        emb = H5pyMediaMapEmb(emb_path)
        return cls.from_data(media_map, emb=emb, media_path=media_path, **kwargs)
263
+
264
+
265
class MediaMapSeq(Items):
    """A sequence of ``MediaMap`` objects that behaves like one merged map."""

    def __init__(self, maps: List[MediaMap]) -> None:
        super().__init__(maps)
        self.maps = self.data
        self.preprocess()
        # Per-map clip counts and a running sum starting at 0; used to convert
        # a global clip index into (map index, local clip index).
        self.each_map_clipseq_num = [len(m.clipseq) for m in self.maps]
        self.each_map_clipseq_num_cumsum = np.cumsum([0] + self.each_map_clipseq_num)

    @property
    def clipseq(self):
        """All clips of all maps concatenated, using the first map's type."""
        clipseq = []
        for m in self.maps:
            clipseq.extend(m.clipseq.data)
        return type(self.maps[0].clipseq)(clipseq)

    @property
    def stagesseq(self):
        """All stage segments of all maps concatenated."""
        # BUG FIX: MediaMap stores its stage segmentation as ``stageseq``;
        # the original read ``m.stagesseq`` and raised AttributeError.
        stagesseq = []
        for m in self.maps:
            stagesseq.extend(m.stageseq.data)
        return type(self.maps[0].stageseq)(stagesseq)

    @property
    def frameseq(self):
        """All frame segments of all maps concatenated."""
        frameseq = []
        for m in self.maps:
            frameseq.extend(m.frameseq.data)
        return type(self.maps[0].frameseq)(frameseq)

    def preprocess(self):
        for m in self.maps:
            m.preprocess()

    def _combine_str(
        self,
        attrs: List[str],
        sep: str = "|",
        single_maxlen: int = 10,
        total_max_length: int = 60,
    ) -> str:
        """Join per-map string attributes, truncating each and the total."""
        return sep.join([str(attr)[:single_maxlen] for attr in attrs])[
            :total_max_length
        ]

    def get_meta_info_attr(self, key: str, func: Callable) -> Any:
        """Collect ``meta_info.<key>`` from every map and reduce with ``func``."""
        attrs = [m.get_meta_info_attr(key) for m in self.maps]
        return func(attrs)

    @property
    def mediaid(self) -> str:
        return self.get_meta_info_attr(key="mediaid", func=self._combine_str)

    @property
    def media_name(self) -> str:
        return self.get_meta_info_attr(key="media_name", func=self._combine_str)

    @property
    def duration(self) -> float:
        return sum([m.duration for m in self.maps])

    @property
    def media_duration(self) -> float:
        return self.get_meta_info_attr(key="media_duration", func=sum)

    @classmethod
    def from_json_paths(
        cls,
        media_map_class: MediaMap,
        media_paths: str,
        media_map_paths: str,
        emb_paths: str,
        **kwargs,
    ) -> MediaMapSeq:
        """Build one map per (map path, emb path, media path) triple and wrap
        them in a sequence. ``media_map_class`` is the per-item map class."""
        map_seq = [
            media_map_class.from_json_path(
                path=media_map_paths[i],
                emb_path=emb_paths[i],
                media_path=media_paths[i],
                **kwargs,
            )
            for i in range(len(media_map_paths))
        ]
        return cls(map_seq)

    # TODO: implement mapseq stat func
    def stat(self):
        for m in self.maps:
            m.stat()

    def _combine_embs(self, embs):
        """Stack per-map embeddings along the clip axis."""
        return np.concatenate(embs, axis=0)

    @property
    def duration_seq_emb(self):
        embs = [m.duration_seq_emb for m in self.maps]
        return self._combine_embs(embs)

    @property
    def timestamp_seq_emb(self):
        embs = [m.timestamp_seq_emb for m in self.maps]
        return self._combine_embs(embs)

    @property
    def rela_timestamp_seq_emb(self):
        embs = [m.rela_timestamp_seq_emb for m in self.maps]
        return self._combine_embs(embs)

    def clip_idx_2_map_idx(self, idx):
        """Map a global clip index to (map index, clip index within that map)."""
        target_map_idx = bisect.bisect_right(self.each_map_clipseq_num_cumsum, idx)
        # bisect returns the slot after the matching cumsum entry; shift back
        # by one and clamp into the valid map range.
        target_map_idx = min(max(0, target_map_idx - 1), len(self.maps) - 1)
        target_map_clip_idx = idx - self.each_map_clipseq_num_cumsum[target_map_idx]
        return target_map_idx, target_map_clip_idx

    def get_emb(self, key: str, idx: Union[None, int, List[int]] = None) -> np.array:
        """Fetch embeddings addressed by global clip index (None means all)."""
        if idx is None:
            embs = [m.get_emb(key, idx=idx) for m in self.maps]
        else:
            if not isinstance(idx, list):
                idx = [idx]
            embs = []
            for c_idx in idx:
                target_map_idx, target_map_clip_idx = self.clip_idx_2_map_idx(c_idx)
                embs.append(
                    self.maps[target_map_idx].get_emb(key, int(target_map_clip_idx))
                )
        if len(embs) == 1:
            return embs[0]
        else:
            return self._combine_embs(embs)
MuseV/MMCM/mmcm/data/media_map/media_map_process.py ADDED
@@ -0,0 +1,72 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ from __future__ import annotations
2
+
3
+ from typing import List, Union, TYPE_CHECKING
4
+ from ..clip.clip_process import (
5
+ get_subseq_by_time,
6
+ find_time_by_stage,
7
+
8
+ )
9
+ if TYPE_CHECKING:
10
+ from ..media_map.media_map import MediaMap
11
+ from ..clip import Clip, ClipSeq
12
+
13
+
14
+ __all__ =[
15
+ "get_sub_mediamap_by_clip_idx",
16
+ "get_sub_mediamap_by_stage",
17
+ "get_sub_mediamap_by_time",
18
+ ]
19
+
20
+
21
+ def get_sub_mediamap_by_time(media_map:MediaMap, start: int=0, end:int=1, eps=1e-2) -> MediaMap:
22
+ """获取子片段序列,同时更新media_map中的相关信息
23
+
24
+ Args:
25
+ media_map (MediaInfo): _description_
26
+ start (float): 开始时间
27
+ end (float): 结束时间
28
+
29
+ Returns:
30
+ _type_: _description_
31
+ """
32
+ if start < 1:
33
+ start = media_map.duration * start
34
+ if end is None:
35
+ end = media_map.meta_info.media_duration
36
+ elif end <= 1:
37
+ end = media_map.duration * end
38
+ media_map.meta_info.start = start
39
+ media_map.meta_info.end = end
40
+ media_map.clipseq = get_subseq_by_time(
41
+ media_map.clipseq,
42
+ start=start,
43
+ end=end,
44
+ )
45
+ if media_map.stageseq is not None:
46
+ media_map.stageseq = get_subseq_by_time(media_map.stageseq, start=start, end=end)
47
+ return media_map
48
+
49
+
50
+ def get_sub_mediamap_by_clip_idx(media_map: MediaMap, start: int=None, end: int=None) -> MediaMap:
51
+ """不仅获取子片段序列,还要更新media_map中的相关信息
52
+
53
+ Args:
54
+ media_map (_type_): _description_
55
+ """
56
+ if start is None:
57
+ start = 0
58
+ if end is None:
59
+ end = -1
60
+ start = media_map.clipseq[start].time_start
61
+ end = media_map.clipseq[end].time_end
62
+ media_map = get_sub_mediamap_by_time(media_map=media_map, start=start, end=end)
63
+ return media_map
64
+
65
+
66
+ def get_sub_mediamap_by_stage(media_map: MediaMap, stages: Union[str, List[str]]) -> MediaMap:
67
+ if isinstance(stages, List):
68
+ stages = [stages]
69
+ start, _ = find_time_by_stage(media_map.stageseq, stages[0])
70
+ _, end = find_time_by_stage(media_map.stageseq, stages[-1])
71
+ media_map = get_sub_mediamap_by_time(media_map=media_map, start=start, end=end)
72
+ return media_map
MuseV/MMCM/mmcm/music/__init__.py ADDED
@@ -0,0 +1,6 @@
 
 
 
 
 
 
 
1
+ from .music_map.music_map import MusicMap, MusicMapSeq
2
+ from .music_map.music_clip import MusicClip, MusicClipSeq
3
+ from .music_map.meta_info import MusicMetaInfo
4
+ from .music_map.load_music_map import load_music_map
5
+
6
+ from .utils.path_util import get_audio_path_dct
MuseV/MMCM/mmcm/music/music_map/__init__.py ADDED
File without changes
MuseV/MMCM/mmcm/music/music_map/beat_map.py ADDED
@@ -0,0 +1,82 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ import numpy as np
2
+
3
+ from librosa.core.audio import get_duration
4
+
5
+ from ...data.clip.clip_process import insert_endclip, insert_startclip
6
+
7
+ from .clip_process import filter_clipseq_target_point
8
+ from .music_clip import MusicClip, MusicClipSeq
9
+
10
+
11
+ def beatnet2TMEType(beat: np.array, duration: float) -> MusicClipSeq:
12
+ """conver beatnet beat to tme beat type
13
+
14
+ Args:
15
+ beat (np.array): Nx2,
16
+ 1st column is time,
17
+ 2rd is type,
18
+ 0, end point
19
+ 1, strong beat
20
+ 2,3,4 weak beat
21
+ -1 lyric
22
+ duration (float): audio time length
23
+ Returns:
24
+ MusicClipSeq:
25
+ """
26
+ n = len(beat)
27
+ beat = np.insert(beat, 0, 0, axis=0)
28
+ beat = np.insert(beat, n + 1, [duration, 0], axis=0)
29
+ clips = []
30
+ for i in range(n + 1):
31
+ beat_type = int(beat[i + 1, 1])
32
+ clip = MusicClip(
33
+ time_start=beat[i, 0], # 开始时间
34
+ duration=round(beat[i + 1, 0] - beat[i, 0], 3), # 片段持续时间
35
+ clipid=i, # 片段序号,
36
+ timepoint_type=beat_type,
37
+ )
38
+ clips.append(clip)
39
+ clipseq = MusicClipSeq(clips=clips)
40
+ return clipseq
41
+
42
+
43
+ def generate_beatseq_with_beatnet(audio_path: str) -> np.array:
44
+ """使用beatnet生成beat序列
45
+
46
+ Args:
47
+ audio_path (str):
48
+ Returns:
49
+ np.array: beat序列 Nx2,
50
+ 1st column is time,
51
+ 2rd is type,
52
+ 0, end point
53
+ 1, strong beat
54
+ 2,3,4 weak beat
55
+ """
56
+ from BeatNet.BeatNet import BeatNet
57
+
58
+ estimator = BeatNet(1, mode="offline", inference_model="DBN", plot=[], thread=False)
59
+ output = estimator.process(audio_path=audio_path)
60
+ return output
61
+
62
+
63
+ def generate_music_map_with_beatnet(
64
+ audio_path: str, target: list = [0, 1]
65
+ ) -> MusicClipSeq:
66
+ """使用beatnet生成beat MusicClipseq
67
+
68
+ Args:
69
+ audio_path (str):
70
+ target (list, optional): 只保留相应的拍点. Defaults to [0, 1].
71
+
72
+ Returns:
73
+ MusicClipSeq: 返回的beat序列
74
+ beat: np.array, 原始的beat检测结果
75
+ """
76
+ output = generate_beatseq_with_beatnet(audio_path)
77
+ duration = get_duration(filename=audio_path)
78
+ clipseq = beatnet2TMEType(output, duration)
79
+ clipseq = insert_startclip(clipseq)
80
+ clipseq = insert_endclip(clipseq, duration)
81
+ clipseq = filter_clipseq_target_point(clipseq, target=target)
82
+ return clipseq, output
MuseV/MMCM/mmcm/music/music_map/clip_process.py ADDED
@@ -0,0 +1,196 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ from __future__ import annotations
2
+ from typing import TYPE_CHECKING, Dict, List
3
+
4
+ import numpy as np
5
+
6
+ from ...data.clip.clip_process import find_idx_by_time, reset_clipseq_id
7
+ from ...data.clip.clip_fusion import fuse_clips
8
+ from ...utils.util import merge_list_continuous_same_element
9
+
10
+ if TYPE_CHECKING:
11
+ from .music_clip import MusicClip, MusicClipSeq
12
+ from .music_map import MusicMap, MusicMapSeq
13
+
14
+
15
+ # TODO: 待和clip操作做整合
16
+ def music_clip_is_short(clip: MusicClip, th: float = 3) -> bool:
17
+ """判断音乐片段是否过短
18
+
19
+ Args:
20
+ clip (MusicClip): 待判断的音乐片段
21
+ th (float, optional): 短篇的参数. Defaults to 3.
22
+
23
+ Returns:
24
+ bool: 是或不是 短片段
25
+ """
26
+ if clip.duration < th:
27
+ return False
28
+ else:
29
+ return True
30
+
31
+
32
def music_clip_timepoint_is_target(clip: MusicClip, target: list = [-1, 1, 0]) -> bool:
    """Check whether the clip's timepoint type intersects the target types.

    ``clip.timepoint_type`` is either a single int or a string such as
    ``"1_2"`` encoding several types joined by underscores.

    Args:
        clip (MusicClip): clip whose ``timepoint_type`` is inspected.
        target (list, optional): timepoint categories of interest.
            Defaults to [-1, 1, 0].

    Returns:
        bool: True when at least one of the clip's types is in ``target``.
    """
    raw = clip.timepoint_type
    if isinstance(raw, int):
        clip_types = {raw}
    else:
        clip_types = {int(part) for part in raw.split("_")}
    return bool(clip_types & set(target))
51
+
52
+
53
def filter_clipseq_target_point(
    clipseq: MusicClipSeq, target: list = [-1, 1, 0]
) -> MusicClipSeq:
    """Drop timepoints outside ``target`` and fuse the affected clips.

    Walks the sequence keeping a running ``start_clip``; whenever the next
    clip's timepoint is not a target type, the two clips are fused so only
    target timepoints survive as clip boundaries.

    Args:
        clipseq (MusicClipSeq): music clip sequence to filter.
        target (list, optional): timepoint types to keep. Defaults to [-1, 1, 0].

    Returns:
        MusicClipSeq: filtered sequence with clip ids reset.
    """
    n_clipseq = len(clipseq)
    if n_clipseq == 1:
        return clipseq
    newclipseq = []
    start_clip = clipseq[0]
    # ``has_start_clip`` tracks whether ``start_clip`` begins at a kept
    # (target-type) timepoint.
    if music_clip_timepoint_is_target(start_clip, target=target):
        has_start_clip = True
    else:
        has_start_clip = False
    i = 1
    while i <= n_clipseq - 1:
        clip = clipseq[i]
        start_clip_is_target = music_clip_timepoint_is_target(start_clip, target=target)
        next_clip_is_target = music_clip_timepoint_is_target(clip, target=target)
        if not has_start_clip:
            # No valid start yet: skip ahead; the current clip becomes the
            # candidate start if it begins at a target timepoint.
            start_clip = clip
            has_start_clip = next_clip_is_target
        else:
            if start_clip_is_target:
                has_start_clip = True
                if next_clip_is_target:
                    # Both boundaries kept: emit the accumulated clip.
                    newclipseq.append(start_clip)
                    start_clip = clip
                    if i == n_clipseq - 1:
                        newclipseq.append(clip)
                else:
                    # Next boundary dropped: merge the next clip into the
                    # accumulated one.
                    start_clip = fuse_clips(start_clip, clip)
                    if i == n_clipseq - 1:
                        newclipseq.append(start_clip)
            else:
                start_clip = clip
        i += 1
    newclipseq = reset_clipseq_id(newclipseq)
    return newclipseq
102
+
103
+
104
def merge_musicclip_into_clipseq(
    clip: MusicClip, clipseq: MusicClipSeq, th: float = 1
) -> MusicClipSeq:
    """Insert one music clip into a clip sequence unless the split would
    create a fragment shorter than ``th`` seconds.

    Only clips landing in non-lyric regions, or in lyric regions whose stage
    is a chorus ("C"), are inserted.

    Args:
        clip (MusicClip): clip to insert; its ``time_start`` drives placement.
        clipseq (MusicClipSeq): sequence to insert into (mutated in place).
        th (float, optional): minimum allowed fragment length. Defaults to 1.

    Returns:
        MusicClipSeq: the (possibly) extended sequence with clip ids reset.
    """
    n_clipseq = len(clipseq)  # NOTE(review): unused.
    clip_time = clip.time_start
    idx = find_idx_by_time(clipseq, clip_time)
    last_clip_time_start = clipseq[idx].time_start
    next_clip_time_start = clipseq[idx].time_start + clipseq[idx].duration
    # Lengths of the two fragments the insertion would create.
    last_clip_time_delta = clip_time - last_clip_time_start
    clip_duration = next_clip_time_start - clip_time
    # TODO: tune ``th`` for chorus parts to raise note density; waiting for
    # rhythm-game charts.
    # TODO: extract the business rule below into a dedicated function.
    # Only insert keypoints into non-lyric gaps or chorus sections.
    if clipseq[idx].text is None or (
        clipseq[idx].text is not None
        and clipseq[idx].stage is not None
        and "C" in clipseq[idx].stage
    ):
        if (last_clip_time_delta > th) and (clip_duration > th):
            clip.duration = clip_duration
            clipseq[idx].duration = last_clip_time_delta
            clipseq.insert(idx + 1, clip)
    clipseq = reset_clipseq_id(clipseq)
    return clipseq
138
+
139
+
140
def merge_music_clipseq(clipseq1: MusicClipSeq, clipseq2: MusicClipSeq) -> MusicClipSeq:
    """Merge every clip of ``clipseq2`` into ``clipseq1``.

    Each insertion goes through ``merge_musicclip_into_clipseq`` (which may
    skip clips that would create too-short fragments). ``clipseq2`` is
    consumed — it is empty when this returns.

    Args:
        clipseq1 (MusicClipSeq): target sequence receiving the clips.
        clipseq2 (MusicClipSeq): source sequence to merge (emptied).

    Returns:
        MusicClipSeq: the merged sequence.
    """
    while len(clipseq2) > 0:
        head = clipseq2[0]
        clipseq1 = merge_musicclip_into_clipseq(head, clipseq1)
        del clipseq2[0]
    return clipseq1
155
+
156
+
157
def merge_lyricseq_beatseq(
    lyric_clipseq: MusicClipSeq, beat_clipseq: MusicClipSeq
) -> MusicClipSeq:
    """Fuse a beat sequence into a lyric sequence.

    Args:
        lyric_clipseq (MusicClipSeq): lyric clip sequence.
        beat_clipseq (MusicClipSeq): beat clip sequence (consumed).

    Returns:
        MusicClipSeq: the fused music clip sequence.
    """
    return merge_music_clipseq(lyric_clipseq, beat_clipseq)
173
+
174
+
175
def get_stageseq_from_clipseq(clipseq: MusicClipSeq) -> List[Dict]:
    """Fuse neighbouring clips that share the same ``stage`` label.

    Args:
        clipseq (MusicClipSeq): clip sequence carrying per-clip stage labels.

    Returns:
        List[Dict]: stage-level segments, each with clipid, time span, stage
        label, original clip indices (inclusive on both ends, MSS-style) and
        duration.
    """
    labels = [clip.stage for clip in clipseq]
    segments = []
    for seg_id, span in enumerate(merge_list_continuous_same_element(labels)):
        first, last = span["start"], span["end"]
        seg = {
            "clipid": seg_id,
            "time_start": clipseq[first].time_start,
            "time_end": clipseq[last].time_end,
            "stage": span["element"],
            # MSS segments are closed on both ends.
            "original_clipid": list(range(first, last + 1)),
        }
        seg["duration"] = seg["time_end"] - seg["time_start"]
        segments.append(seg)
    return segments
MuseV/MMCM/mmcm/music/music_map/convert_type.py ADDED
@@ -0,0 +1,57 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ from ...data.clip.clip_process import (
2
+ insert_startclip,
3
+ insert_endclip,
4
+ reset_clipseq_id,
5
+ )
6
+
7
+ from .music_clip import MusicClip, MusicClipSeq
8
+
9
+
10
def read_osu_hitobjs(path: str) -> list:
    """Read an osu! beatmap file and return the raw [HitObjects] lines.

    Args:
        path (str): beatmap file path.

    Returns:
        list: stripped lines appearing after the ``[HitObjects]`` header.
    """
    hit_lines = []
    in_section = False
    with open(path, "r") as fp:
        for raw in fp:
            if in_section:
                hit_lines.append(raw.strip())
            elif "[HitObjects]" in raw:
                in_section = True
    return hit_lines
28
+
29
+
30
def osu2itech(src: list, duration: float = None) -> MusicClipSeq:
    """Convert an osu! beatmap into our target ``MusicClipSeq`` format.

    Args:
        src (list or str): beatmap file path, or the already-read
            [HitObjects] line strings.
        duration (float, optional): total song length in seconds. Defaults to None.

    Returns:
        MusicClipSeq: one clip per inter-hit interval.
    """
    if isinstance(src, str):
        src = read_osu_hitobjs(src)
    # osu! stores hit times in milliseconds in the 3rd comma-separated field.
    timepoints = [float(line.split(",")[2]) for line in src]
    clips = []
    for i in range(len(timepoints) - 1):
        clip = MusicClip(
            time_start=round(timepoints[i] / 1000, 3),
            timepoint_type=0,
            duration=round((timepoints[i + 1] - timepoints[i]) / 1000, 3),
            clipid=i,
        )
        clips.append(clip)
    if len(clips) > 0:
        # Pad the sequence to start at 0 and (when known) end at ``duration``.
        clips = insert_startclip(clips)
        if duration is not None:
            clips = insert_endclip(clips, duration=duration)
        clips = reset_clipseq_id(clips)
    return MusicClipSeq(clips)
MuseV/MMCM/mmcm/music/music_map/load_music_map.py ADDED
@@ -0,0 +1,38 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+
2
+
3
+ from typing import List
4
+
5
+ from .music_map import MusicMap, MusicMapSeq
6
+
7
+
8
+ def load_music_map(
9
+ music_map_paths,
10
+ music_paths,
11
+ emb_paths,
12
+ start: float=None,
13
+ end: None=None,
14
+ target_stages: List[str] = None,
15
+ **kwargs,
16
+ ):
17
+ """读取视频谱面,转化成MusicInfo。当 musicinfo_path_lst 为列表时,表示多歌曲
18
+
19
+ Args:
20
+ musicinfo_path_lst (str or [str]): 视频谱面路径文件列表
21
+ music_path_lst (str or [str]): 视频文件路径文件列表,须与musicinfo_path_lst等长度
22
+
23
+
24
+ Returns:
25
+ MusicInfo: 视频谱面信息
26
+ """
27
+ dct ={
28
+ "start": start,
29
+ "end": end,
30
+ "target_stages": target_stages,
31
+ }
32
+ if isinstance(music_map_paths, list):
33
+ music_map = MusicMapSeq.from_json_paths(media_map_class=MusicMapSeq, media_paths=music_paths, media_map_paths=music_map_paths, emb_paths=emb_paths, **dct, **kwargs)
34
+ if len(music_map) == 1:
35
+ music_map = music_map[0]
36
+ else:
37
+ music_map = MusicMap.from_json_path(path=music_map_paths, emb_path=emb_paths, media_path=music_paths, **dct, **kwargs)
38
+ return music_map
MuseV/MMCM/mmcm/music/music_map/lyric_map.py ADDED
@@ -0,0 +1,149 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ import numpy as np
2
+ from sklearn.preprocessing import normalize, minmax_scale
3
+ from scipy.signal import savgol_filter
4
+
5
+ # TODO:待更新音乐谱面的类信息
6
+ from ...data.clip.clip_process import (
7
+ complete_clipseq,
8
+ find_idx_by_clip,
9
+ insert_endclip,
10
+ insert_startclip,
11
+ reset_clipseq_id,
12
+ )
13
+
14
+ from .music_clip import Clip, ClipSeq
15
+ from .music_clip import MusicClipSeq
16
+ from .music_map import MusicMap
17
+
18
+
19
+ def generate_lyric_map(
20
+ path: str, duration: float = None, gap_th: float = 2
21
+ ) -> MusicClipSeq:
22
+ """从歌词文件中生成音乐谱面
23
+
24
+ Args:
25
+ path (str): 歌词文件路径
26
+ duration (float, optional): 歌词对应音频的总时长. Defaults to None.
27
+ gap_th (float, optional): 歌词中间的空白部分是否融合到上一个片段中. Defaults to 3.
28
+
29
+ Returns:
30
+ MusicClipSeq: 以歌词文件生成的音乐谱面
31
+ """
32
+ from ..music_map.lyric_process import lyricfile2musicinfo
33
+
34
+ lyric_info = lyricfile2musicinfo(path)
35
+ lyric_info = MusicMap(lyric_info, duration=duration)
36
+ clipseq = lyric_info.clipseq
37
+ lyric_info.meta_info.duration = duration
38
+ # set part of nonlyric as clip whose timepoint is 0
39
+ for i in range(len(clipseq)):
40
+ clipseq[i].timepoint_type = -1
41
+ lyric_info.clipseq = complete_clipseq(
42
+ clipseq=clipseq, duration=duration, gap_th=gap_th
43
+ )
44
+ return lyric_info
45
+
46
+
47
+ def insert_field_2_clipseq(clipseq: ClipSeq, reference: ClipSeq, field: str) -> ClipSeq:
48
+ """将reference中每个clip的字段信息根据赋给clipseq中最近的clip
49
+
50
+ Args:
51
+ clipseq (ClipSeq): 目标clip序列
52
+ reference (ClipSeq): 参考clip序列
53
+ field (str): 目标字段
54
+
55
+ Returns:
56
+ ClipSeq: 更新目标字段新值后的clip序列
57
+ """
58
+ for i, clip in enumerate(clipseq):
59
+ idx = find_idx_by_clip(reference, clip=clip)
60
+ if idx is not None:
61
+ if getattr(reference[idx], field) is not None:
62
+ clipseq[i].__dict__[field] = getattr(reference[idx], field)
63
+ return clipseq
64
+
65
+
66
+ def insert_rythm_2_clipseq(clipseq, reference):
67
+ """参考MSS字段的结构信息设置rythm信息。目前策略非常简单,主歌(Vx)0.25,副歌(Cx)0.75,其他为None
68
+
69
+ Args:
70
+ clipseq (ClipSeq): 目标clip序列,设置rythm字段
71
+ reference (ClipSeq): 参考clip序列,参考stage字段
72
+
73
+ Returns:
74
+ ClipSeq: 更新rythm字段新值后的clip序列
75
+ """
76
+
77
+ def stage2rythm(stage):
78
+ if "V" in stage:
79
+ return 0.25
80
+ elif "C" in stage:
81
+ return 0.75
82
+ else:
83
+ return None
84
+
85
+ for i, clip in enumerate(clipseq):
86
+ idx = find_idx_by_clip(reference, clip=clip)
87
+ if idx is not None:
88
+ if reference[idx].rythm is not None:
89
+ clipseq[i].rythm = stage2rythm(reference[idx].stage)
90
+ return clipseq
91
+
92
+
93
+ def insert_rythm_from_clip(clipseq: MusicClipSeq, beat: np.array) -> MusicClipSeq:
94
+ """给MusicClipSeq中的每个Clip新增节奏信息。目前使用
95
+ 1. 单位时间内的歌词数量特征, 使用 min-max 归一化到 0 - 1 之间
96
+ 2. 单位时间内的关键点数量,目前使用beatnet,使用 min-max 归一化到 0 - 1 之间
97
+ 3. 对1、2中的特征相加,并根据歌曲结构不同进行加权
98
+ Args:
99
+ clipseq (MusicClipSeq): 待处理的 MusicClipSeq
100
+ beat (np.array): beat检测结果,Nx2,,用于结算单位时间内的关键点数。
101
+ 1st column is time,
102
+ 2rd is type,
103
+ 0, end point
104
+ 1, strong beat
105
+ 2,3,4 weak beat
106
+
107
+ Returns:
108
+ MusicClipSeq: 新增 rythm 的 MusicClipSeq
109
+ """
110
+ mss_cofficient = {
111
+ "intro": 1.0,
112
+ "bridge": 1.0,
113
+ "end": 0.8,
114
+ "VA": 1.0,
115
+ "VB": 1.0,
116
+ "CA": 1.6,
117
+ "CB": 1.6,
118
+ }
119
+ # text_num_per_second
120
+ text_num_per_second_lst = [clip.tnps for clip in clipseq if clip.tnps != 0]
121
+ common_tnps = np.min(text_num_per_second_lst)
122
+ tnps = np.array([clip.tnps if clip.tnps != 0 else common_tnps for clip in clipseq])
123
+ tnps = minmax_scale(tnps)
124
+ # beat point _num_per_second
125
+ beat_pnps = np.zeros(len(clipseq))
126
+ for i, clip in enumerate(clipseq):
127
+ time_start = clip.time_start
128
+ time_end = clip.time_end
129
+ target_beat = beat[(beat[:, 0] >= time_start) & (beat[:, 0] < time_end)]
130
+ beat_pnps[i] = len(target_beat) / clip.duration
131
+ beat_pnps = minmax_scale(beat_pnps)
132
+
133
+ # cofficient
134
+ cofficients = np.array(
135
+ [
136
+ mss_cofficient[clip.stage]
137
+ if clip.stage in mss_cofficient and clip.stage is not None
138
+ else 1.0
139
+ for clip in clipseq
140
+ ]
141
+ )
142
+
143
+ rythm = cofficients * (tnps + beat_pnps)
144
+ rythm = minmax_scale(rythm)
145
+ rythm = savgol_filter(rythm, window_length=5, polyorder=3)
146
+ rythm = minmax_scale(rythm)
147
+ for i, clip in enumerate(clipseq):
148
+ clip.dynamic = rythm[i]
149
+ return clipseq
MuseV/MMCM/mmcm/music/music_map/lyric_process.py ADDED
@@ -0,0 +1,515 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ from genericpath import isfile
2
+ import re
3
+ import os
4
+
5
+ from ...text.utils.read_text import read_xml2json
6
+
7
+
8
# A very handy regular-expression playground:
# https://regex101.com/r/cW8jA6/2


CHINESE_PATTERN = r"[\u4e00-\u9fff]+"
NOT_CHINESE_PATTERN = r"[^\u4e00-\u9fa5]"
ENGLISH_CHARACHTER_PATTERN = r"[a-zA-Z]+"
WORD_PATTERN = r"\w+"  # equal to [a-zA-Z0-9_].
NOT_WORD_PATTERN = r"\W+"
17
+
18
+
19
def has_target_string(lyric: str, pattern: str) -> bool:
    """Return True when the lyric line contains the target string.

    Args:
        lyric (str): lyric line to inspect.
        pattern (str): regular expression describing the target string.

    Returns:
        bool: whether at least one match of ``pattern`` occurs in ``lyric``.
    """
    return len(re.findall(pattern, lyric)) > 0
32
+
33
+
34
def has_chinese_char(lyric: str) -> bool:
    """Return True when the lyric contains at least one Chinese character.

    Args:
        lyric (str): lyric line to inspect.

    Returns:
        bool: whether any Chinese character occurs.
    """
    return has_target_string(lyric=lyric, pattern=CHINESE_PATTERN)
44
+
45
+
46
def has_non_chinese_char(lyric: str) -> bool:
    """Return True when the lyric contains any non-Chinese character.

    See https://git.woa.com/innovative_tech/CopyrightGroup/LyricTools/blob/master/lyric_tools/dataProcess.py#L53

    Args:
        lyric (str): lyric line to inspect.

    Returns:
        bool: whether any non-Chinese character occurs.
    """
    return has_target_string(lyric=lyric, pattern=NOT_CHINESE_PATTERN)
56
+
57
+
58
def has_english_alphabet_char(lyric: str) -> bool:
    """Return True when the lyric contains an English alphabet character.

    Args:
        lyric (str): lyric line to inspect.

    Returns:
        bool: whether any a-z / A-Z character occurs.
    """
    return has_target_string(lyric=lyric, pattern=ENGLISH_CHARACHTER_PATTERN)
68
+
69
+
70
def check_is_lyric_row(lyric: str) -> bool:
    """Return True when the given QRC row is an actual lyric line.

    Metadata rows are rejected: title/artist/album/by/offset tags,
    production-credit lines (composer, arranger, mixing, ...), copyright
    notices, and any row containing a half- or full-width colon.

    Args:
        lyric (str): candidate row.

    Returns:
        bool: True for a real lyric row, False for a metadata row.
    """
    metadata_patterns = (
        r"\[ti[::]?",
        r"\[ar[::]?",
        r"\[al[::]?",
        r"\[by[::]?",
        r"\[offset[::]?",
        r"词[::]?\(\d+,\d+\)[::]?",
        r"曲[::]?\(\d+,\d+\)[::]?",
        r"作\(\d+,\d+\)词[::]?",
        r"作\(\d+,\d+\)曲[::]?",
        r"演\(\d+,\d+\)唱[::]?",
        r"编\(\d+,\d+\)曲[::]?",
        r"吉\(\d+,\d+\)他[::]",
        r"人\(\d+,\d+\)声\(\d+,\d+\)录\(\d+,\d+\)音\(\d+,\d+\)师[::]?",
        r"人\(\d+,\d+\)声\(\d+,\d+\)录\(\d+,\d+\)音\(\d+,\d+\)棚[::]?",
        r"Vocal\s+\(\d+,\d+\)edite[::]?",
        r"混\(\d+,\d+\)音\(\d+,\d+\)/\(\d+,\d+\)母\(\d+,\d+\)带[::]?",
        r"混\(\d+,\d+\)音",
        r"和\(\d+,\d+\)声\(\d+,\d+\)编\(\d+,\d+\)写[::]?",
        r"词\(\d+,\d+\)版\(\d+,\d+\)权\(\d+,\d+\)管\(\d+,\d+\)理\(\d+,\d+\)方[::]?",
        r"曲\(\d+,\d+\)版\(\d+,\d+\)权\(\d+,\d+\)管\(\d+,\d+\)理\(\d+,\d+\)方[::]?",
        r"联\(\d+,\d+\)合\(\d+,\d+\)出\(\d+,\d+\)品[::]?",
        r"录\(\d+,\d+\)音\(\d+,\d+\)作\(\d+,\d+\)品",
        r"录\(\d+,\d+\)音\(\d+,\d+\)作\(\d+,\d+\)品\(\d+,\d+\)监\(\d+,\d+\)制[::]?",
        r"制\(\d+,\d+\)作\(\d+,\d+\)人[::]?",
        # duplicated pattern kept from the original list; harmless
        r"制\(\d+,\d+\)作\(\d+,\d+\)人[::]?",
        r"不\(\d+,\d+\)得\(\d+,\d+\)翻\(\d+,\d+\)唱",
        r"未\(\d+,\d+\)经\(\d+,\d+\)许\(\d+,\d+\)可",
        r"酷\(\d+,\d+\)狗\(\d+,\d+\)音\(\d+,\d+\)乐",
        r"[::]",
    )
    return not any(re.search(pattern, lyric) for pattern in metadata_patterns)
120
+
121
+
122
def lyric2clip(lyric: str) -> dict:
    """Convert one QRC lyric line into a clip dict.

    Clip definition:
    https://git.woa.com/innovative_tech/VideoMashup/blob/master/videomashup/media/clip.py

    Args:
        lyric (str): one QRC line, e.g.
            "[173247,3275]去(173247,403)吗(173649,677)".

    Returns:
        dict: clip dict carrying per-word sub-clips under "clips".
    """
    stamps = re.findall(r"\d+,\d+", lyric)
    # leading [start,duration] tag describes the whole line (milliseconds)
    line_start = round(int(stamps[0].split(",")[0]) / 1000, 3)
    line_dur = round(int(stamps[0].split(",")[-1]) / 1000, 3)
    line_end = line_start + line_dur
    # the last word may end earlier than the line tag claims
    last_start = round(int(stamps[-1].split(",")[0]) / 1000, 3)
    last_dur = round(int(stamps[-1].split(",")[-1]) / 1000, 3)
    last_end = last_start + last_dur
    actual_duration = min(line_end, last_end) - line_start
    lyric = re.sub(r"\[\d+,\d+\]", "", lyric)

    # split out each word's start time, end time and text
    words_with_timestamp = get_words_with_timestamp(lyric)

    lyric = re.sub(r"\(\d+,\d+\)", "", lyric)
    return {
        "time_start": line_start,
        "duration": actual_duration,
        "text": lyric,
        "original_text": lyric,
        "timepoint_type": -1,
        "clips": words_with_timestamp,
    }
154
+
155
+
156
# by yuuhong
def get_words_with_timestamp(lyric):
    """Split one QRC lyric line into per-word clip dicts.

    Example input:
        漫(17316,178)步(17494,174)走(17668,193)在(17861,183) (18044,0)莎(18044,153)玛(18197,159)丽(18356,176)丹(18532,200)

    Args:
        lyric: QRC line without the leading [start,duration] tag.

    Returns:
        list: dicts with "text", "time_start" and "duration" per word.
    """
    words_with_timestamp = []
    for chunk in lyric.split(")"):
        pieces = chunk.split("(")
        if len(pieces) != 2:
            continue
        word, stamp = pieces
        if re.match(r"\d+,\d+", stamp):
            # valid "start,duration" timestamp in milliseconds
            start = round(int(stamp.split(",")[0]) / 1000, 3)
            duration = round(int(stamp.split(",")[1]) / 1000, 3)
            words_with_timestamp.append(
                {"text": word, "time_start": start, "duration": duration}
            )
    return words_with_timestamp
177
+
178
+
179
def lyric2clips(lyric: str, th: float = 0.75) -> list:
    """Convert one lyric line into one or more clips.

    A Chinese line is split on spaces; if any resulting fragment is too
    short (duration <= th seconds) the whole line is kept as one clip.

    Args:
        lyric (str): such as [173247,3275]去(173247,403)吗(173649,677) 配(174326,189)吗(174516,593) 这(175108,279)
        th (float, optional): fragments at or below this duration trigger
            whole-line handling. Defaults to 0.75.

    Returns:
        list: sequence of lyric clip dicts.
    """
    # Only a Chinese line is split on spaces; any line containing English
    # letters is handled as a whole.
    if has_english_alphabet_char(lyric):
        return [lyric2clip(lyric)]
    splited_lyric = lyric.split(" ")
    if len(splited_lyric) == 1:
        return [lyric2clip(splited_lyric[0])]
    line_time_str, sub_lyric = re.split(r"]", splited_lyric[0])
    line_time_groups = re.findall(r"\d+,\d+", line_time_str)
    line_time_start = round(int(line_time_groups[0].split(",")[0]) / 1000, 3)
    line_duration = round(int(line_time_groups[0].split(",")[-1]) / 1000, 3)
    splited_lyric[0] = sub_lyric
    # In QRC the timestamp should directly follow each word, with spaces
    # after the timestamp; sometimes the space precedes the timestamp
    # instead, which is repaired here.
    # wrong: [173247,3275]去(173247,403)吗 (173649,677)配(174326,189)吗 (174516,593)这(175108,279)
    # wrong: [46122,2082]以(46122,213)身(46335,260)淬(46595,209)炼(46804,268)天(47072,250)地(47322,370)造(47692,341)化 (48033,172)
    # fixed: [173247,3275]去(173247,403)吗(173649,677) 配(174326,189)吗(174516,593) 这(175108,279)
    # NOTE(review): this loop mutates splited_lyric while indexing i and
    # i + 1; it assumes a fragment not ending in ")" is never last —
    # confirm inputs guarantee this, else i + 1 can raise IndexError.
    for i in range(len(splited_lyric)):
        if splited_lyric[i] == "":
            del splited_lyric[i]
            break
        if splited_lyric[i][-1] != ")":
            next_lyric_time_start = re.search(
                r"\(\d+,\d+\)", splited_lyric[i + 1]
            ).group(0)
            splited_lyric[i] += next_lyric_time_start
            splited_lyric[i + 1] = re.sub(
                next_lyric_time_start, "", splited_lyric[i + 1]
            )
            splited_lyric[i + 1] = re.sub("\(\)", "", splited_lyric[i + 1])
    lyric_text = re.sub(r"\[\d+,\d+\]", "", lyric)
    lyric_text = re.sub(r"\(\d+,\d+\)", "", lyric_text)
    clips = []
    has_short_clip = False
    for sub_lyric in splited_lyric:
        sub_lyric_groups = re.findall(r"\d+,\d+", sub_lyric)
        sub_lyric_1st_word_time_start = round(
            int(sub_lyric_groups[0].split(",")[0]) / 1000, 3
        )
        sub_lyric_last_word_time_start = round(
            int(sub_lyric_groups[-1].split(",")[0]) / 1000, 3
        )
        sub_lyric_last_word_duration = round(
            int(sub_lyric_groups[-1].split(",")[-1]) / 1000, 3
        )
        sub_lyric_last_word_time_end = (
            sub_lyric_last_word_time_start + sub_lyric_last_word_duration
        )
        sub_lyric_duration = (
            sub_lyric_last_word_time_end - sub_lyric_1st_word_time_start
        )
        if sub_lyric_duration <= th:
            has_short_clip = True
            break
        sub_lyric_text = re.sub(r"\[\d+,\d+\]", "", sub_lyric)
        sub_lyric_text = re.sub(r"\(\d+,\d+\)", "", sub_lyric_text)
        # keep the full-line text in original_text so each fragment retains
        # semantic context for downstream matching
        dct = {
            "time_start": sub_lyric_1st_word_time_start,
            "duration": sub_lyric_duration,
            "text": sub_lyric_text,
            "original_text": lyric_text,
            "timepoint_type": -1,
        }
        clips.append(dct)
    if has_short_clip:
        clips = [lyric2clip(lyric)]
    return clips
255
+
256
+
257
def is_songname(lyric: str) -> bool:
    """Return True for the song-title row, which carries a "ti" tag,
    e.g. "[ti:霍元甲 (《霍元甲》电影主题曲)]".

    Args:
        lyric (str): candidate row.

    Returns:
        bool: whether this row holds the song title.
    """
    return has_target_string(lyric, r"\[ti[::]?")
267
+
268
+
269
def get_songname(lyric: str) -> str:
    """Extract the song title from a row like "[ti:霍元甲 (《霍元甲》电影主题曲)]".

    Args:
        lyric (str): QRC row containing the title tag.

    Returns:
        str: the song title.
    """
    head = lyric.split("(")[0]
    # drop the leading "[ti:" (4 chars) and the trailing character
    return head[4:-1]
279
+
280
+
281
def is_album(lyric: str) -> bool:
    """Return True for the album row, e.g. "[al:霍元甲]".

    Args:
        lyric (str): candidate row.

    Returns:
        bool: whether this row holds the album name.
    """
    return has_target_string(lyric, r"\[al[::]?")
292
+
293
+
294
def get_album(lyric: str) -> str:
    """Extract the album name from a row like "[al:霍元甲]".

    Args:
        lyric (str): QRC row containing the album tag.

    Returns:
        str: the album name.
    """
    # drop the leading "[al:" (4 chars) and the trailing "]"
    return lyric[4:-1]
305
+
306
+
307
def is_singer(lyric: str) -> bool:
    """Return True for the singer row, e.g. "[ar:周杰伦]".

    Args:
        lyric (str): candidate row.

    Returns:
        bool: whether this row holds the singer name.
    """
    return has_target_string(lyric, r"\[ar[::]?")
317
+
318
+
319
def get_singer(lyric: str) -> str:
    """Extract the singer name from a row like "[ar:周杰伦]".

    Args:
        lyric (str): QRC row containing the singer tag.

    Returns:
        str: the singer name.
    """
    # drop the leading "[ar:" (4 chars) and the trailing "]"
    return lyric[4:-1]
329
+
330
+
331
def lyric2musicinfo(lyric: str) -> dict:
    """Convert QRC lyric content into a music-info dict.

    See https://git.woa.com/innovative_tech/VideoMashup/blob/master/videomashup/media/media_info.py#L19
    {
        "meta_info": {},
        "sub_meta_info": [],
        "clips": [
            clip
        ]
    }

    Args:
        lyric (str): lyric content from QRC.
            NOTE(review): despite the str annotation, the first statement
            indexes it as the dict produced by read_xml2json — confirm
            with callers.

    Returns:
        musicinfo: music map dict, see
        https://git.woa.com/innovative_tech/VideoMashup/blob/master/videomashup/media/media_info.py#L19
    """
    lyrics = lyric["QrcInfos"]["LyricInfo"]["Lyric_1"]["@LyricContent"]
    musicinfo = {
        "meta_info": {
            "mediaid": None,
            "media_name": None,
            "singer": None,
        },
        # NOTE(review): "sub_meata_info" looks like a typo of
        # "sub_meta_info"; kept as-is since consumers may rely on the key.
        "sub_meata_info": {},
        "clips": [],
    }
    # lyrics = [line.strip() for line in re.split(r"[\t\n\s+]", lyrics)]
    # re-prepend the "[" consumed by the split so every row keeps its tag
    lyrics = ["[" + line.strip() for line in re.split(r"\[", lyrics)]
    next_is_title_row = False
    lyric_clips = []
    for line in lyrics:
        if is_songname(line):
            musicinfo["meta_info"]["media_name"] = get_songname(line)
            continue
        if is_singer(line):
            musicinfo["meta_info"]["singer"] = get_singer(line)
            continue
        if is_album(line):
            musicinfo["meta_info"]["album"] = get_album(line)
            continue
        is_lyric_row = check_is_lyric_row(line)
        # the row right after the [offset...] tag is the title row; skip it
        if next_is_title_row:
            next_is_title_row = False
            continue
        # remove title row
        if not next_is_title_row and re.search(r"\[offset[::]", line):
            next_is_title_row = True
        if is_lyric_row and re.match(r"\[\d+,\d+\]", line):
            # whole line as one clip for meta lyric, split clips for clipseq
            lyric_clip = lyric2clip(line)
            lyric_clips.append(lyric_clip)
            clips = lyric2clips(line)
            musicinfo["clips"].extend(clips)
    musicinfo["meta_info"]["lyric"] = lyric_clips
    return musicinfo
386
+
387
+
388
def lrc_timestr2time(time_str: str) -> float:
    """Convert an LRC timestamp such as "00:00.00" into seconds.

    Args:
        time_str (str): "mm:ss.xx" timestamp text.

    Returns:
        float: seconds, rounded to millisecond precision.
    """
    minutes, seconds, millis = (float(piece) for piece in re.split(r"[:.]", time_str))
    return round(minutes * 60 + seconds + millis / 1000, 3)
399
+
400
+
401
def get_lrc_line_time(text: str, time_pattern: str) -> str:
    """Extract the timestamp of an LRC line, e.g. "[00:00.00]本字幕由天琴实验室独家AI字幕技术生成".

    Args:
        text (str): input LRC line.
        time_pattern (str): regex matching the timestamp text.

    Returns:
        The line start time in seconds (a float, despite the historical
        str annotation kept for interface compatibility).
    """
    stamp = re.search(time_pattern, text).group(0)
    return lrc_timestr2time(stamp)
413
+
414
+
415
def lrc_lyric2clip(lyric: str, time_pattern: str, duration: float) -> dict:
    """Convert one LRC line into a clip dict.

    Args:
        lyric (str): e.g. "[00:00.00]本字幕由天琴实验室独家AI字幕技术生成"
        time_pattern (str): timestamp regex, e.g. r"\d+:\d+\.\d+"
        duration (float): clip duration in seconds.

    Returns:
        dict: clip dict, definition at
        https://git.woa.com/innovative_tech/VideoMashup/blob/master/videomashup/media/clip.py
    """
    start = get_lrc_line_time(lyric, time_pattern=time_pattern)
    # strip the timestamp, then the two leftover bracket characters
    text = re.sub(time_pattern, "", lyric)[2:]
    return {
        "time_start": start,
        "duration": duration,
        "text": text,
        "timepoint_type": -1,
    }
437
+
438
+
439
def lrc2musicinfo(lyric: str, time_pattern: str = "\d+:\d+\.\d+") -> dict:
    """Convert lrc content (file path, raw text, or line list) into a music map dict.

    Args:
        lyric (str): lrc file path, raw lrc text, or an already-split list
            of lines (used by the recursive calls).
        time_pattern (str, optional): lrc timestamp regex. Defaults to "\d+:\d+\.\d+".

    Returns:
        dict: music map dict, definition at
        https://git.woa.com/innovative_tech/VideoMashup/blob/master/videomashup/music/music_info.py
    """
    if isinstance(lyric, str):
        # normalize both file-path and raw-text inputs to a list of lines,
        # then recurse into the list branch below
        if os.path.isfile(lyric):
            with open(lyric, "r") as f:
                lyric = [line.strip() for line in f.readlines()]
            return lrc2musicinfo(lyric)
        else:
            lyric = lyric.split("\n")
            return lrc2musicinfo(lyric)
    else:
        musicinfo = {
            "meta_info": {
                "mediaid": None,
                "media_name": None,
                "singer": None,
            },
            # NOTE(review): "sub_meata_info" looks like a typo of
            # "sub_meta_info"; kept because consumers may rely on the key.
            "sub_meata_info": {},
            "clips": [],
        }
        # lyrics = [line.strip() for line in re.split(r"[\t\n\s+]", lyrics)]
        # NOTE(review): lyric_clips is never appended to below, so
        # meta_info["lyric"] is always empty here — compare with
        # lyric2musicinfo; confirm this is intended.
        lyric_clips = []
        rows = len(lyric)
        for i, line in enumerate(lyric):
            if is_songname(line):
                musicinfo["meta_info"]["media_name"] = line[4:-1]
                continue
            if is_singer(line):
                musicinfo["meta_info"]["singer"] = line[4:-1]
                continue
            if is_album(line):
                musicinfo["meta_info"]["album"] = line[4:-1]
                continue
            if len(re.findall(time_pattern, line)) > 0:
                # duration is the gap to the next timestamped line; the
                # final line gets a fixed 1-second duration
                if i < rows - 1:
                    time_start = get_lrc_line_time(line, time_pattern=time_pattern)
                    next_line_time_start = get_lrc_line_time(
                        lyric[i + 1], time_pattern=time_pattern
                    )
                    duration = next_line_time_start - time_start
                else:
                    duration = 1
                clip = lrc_lyric2clip(
                    line, duration=duration, time_pattern=time_pattern
                )
                musicinfo["clips"].append(clip)
        musicinfo["meta_info"]["lyric"] = lyric_clips
        return musicinfo
495
+
496
+
497
def lyricfile2musicinfo(path: str) -> dict:
    """Convert a lyric file into a music map dict.

    Supported formats: QRC xml files and lrc files.
    TODO: support osu.

    Args:
        path (str): lyric file path.

    Returns:
        dict: music map dict, definition at
        https://git.woa.com/innovative_tech/VideoMashup/blob/master/videomashup/music/music_info.py

    Raises:
        ValueError: when the file extension is neither xml nor lrc.
    """
    # os.path.splitext is robust to dots inside the file name; the previous
    # `basename.split(".")` two-value unpack raised ValueError for names
    # like "song.v1.lrc".
    filename, ext = os.path.splitext(os.path.basename(path))
    ext = ext.lstrip(".").lower()
    if ext == "xml":
        musicinfo = lyric2musicinfo(read_xml2json(path))
    elif ext == "lrc":
        musicinfo = lrc2musicinfo(path)
    else:
        # previously fell through with musicinfo unbound -> NameError
        raise ValueError(f"unsupported lyric file type: {path}")
    musicinfo["meta_info"]["mediaid"] = filename
    return musicinfo
MuseV/MMCM/mmcm/music/music_map/meta_info.py ADDED
@@ -0,0 +1,21 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ from __future__ import annotations
2
+
3
+ from ...data import MetaInfo
4
+
5
+
6
class MusicMetaInfo(MetaInfo):
    """Music-specific metadata: singer, genre, language and lyric path on top of MetaInfo."""

    def __init__(self, mediaid=None, media_name=None, media_duration=None, signature=None, media_path: str = None, media_map_path: str = None,
                 singer=None,
                 lyric_path=None,
                 genre=None,
                 language=None,
                 start: float = None, end: float = None, ext=None, **kwargs):
        """Store music metadata; base fields are forwarded positionally to MetaInfo."""
        super().__init__(mediaid, media_name, media_duration, signature, media_path, media_map_path, start, end, ext, **kwargs)
        # music-only fields
        self.singer = singer
        self.genre = genre
        self.language = language
        self.lyric_path = lyric_path

    @classmethod
    def from_data(cls, data) -> MusicMetaInfo:
        """Build a MusicMetaInfo from a plain keyword dict."""
        return MusicMetaInfo(**data)
MuseV/MMCM/mmcm/music/music_map/mss_map.py ADDED
@@ -0,0 +1,185 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ import logging
2
+
3
+ from .music_clip import MusicClip, MusicClipSeq
4
+ from .music_map import MusicMap
5
+ from ...data.clip.clip_process import find_idx_by_time
6
+
7
+ logger = logging.getLogger(__name__) # pylint: disable=invalid-name
8
+
9
+
10
def insert_mss_2_clipseq(
    clipseq: MusicClipSeq, mss_clipseq: MusicClipSeq
) -> MusicClipSeq:
    """Copy the song-structure ``stage`` field from mss clips onto clipseq.

    Each target clip receives the stage of the mss clip covering its start
    time; clips with no covering mss clip get "unknow" (sic).

    Args:
        clipseq (MusicClipSeq): target clip sequence, modified in place.
        mss_clipseq (MusicClipSeq): reference clip sequence with stages.

    Returns:
        MusicClipSeq: the same clipseq with stages filled in.
    """
    for idx, target_clip in enumerate(clipseq):
        ref_idx = find_idx_by_time(mss_clipseq, target_clip.time_start)
        clipseq[idx].stage = (
            mss_clipseq[ref_idx].stage if ref_idx is not None else "unknow"
        )
    return clipseq
30
+
31
+
32
def get_mss_musicinfo(songid: str) -> MusicMap:
    """Fetch Tianqin song-structure (mss) info via media_data and wrap it.

    Args:
        songid (str): song id.

    Returns:
        MusicMap: music map built from the mss data, or None when the
        lookup or import fails.
    """
    mss = None
    try:
        from media_data.oi.tianqin_database import get_mss

        mss = get_mss(songid=songid)
    except Exception as e:
        # best-effort: log and fall through with mss = None
        logger.warning("get mss failed, mss={}".format(songid))
        logger.exception(e)
    return MusicMap(mss) if mss is not None else None
51
+
52
+
53
def merge_mss(musicinfo: MusicMap, mss: MusicMap) -> MusicMap:
    """Merge an mss music map into a target music map.

    Copies the bpm and, when the mss map has clips, spreads its stage
    labels onto the target's clip sequence.

    Args:
        musicinfo (MusicMap): target music map, modified in place.
        mss (MusicMap): mss music map to merge in.

    Returns:
        MusicMap: the merged target map.
    """
    musicinfo.meta_info.bpm = mss.meta_info.bpm
    has_mss_clips = len(mss.clipseq) > 0
    if has_mss_clips:
        musicinfo.clipseq = insert_mss_2_clipseq(musicinfo.clipseq, mss.clipseq)
    return musicinfo
67
+
68
+
69
def generate_mss_from_lyric(lyrics: list, audio_duration: float, th=8) -> MusicClipSeq:
    """Build a coarse song-structure clip sequence from lyric clips.

    Emits "intro" before the first lyric, "end" after the last,
    "bridge" for inter-lyric gaps of at least ``th`` seconds, and
    "lyric" filler segments covering the remaining sung spans, then
    returns everything sorted by start time.

    Args:
        lyrics (list): lyric clip dicts with "time_start" and "duration".
        audio_duration (float): total audio length in seconds.
        th (int, optional): minimum gap treated as a bridge. Defaults to 8.

    Returns:
        MusicClipSeq: time-sorted stage clip sequence.
    """
    # stage vocabulary elsewhere: "intro", "VA", "CA", "bridge", "VB", "CB", "end"
    mss = []
    n_lyric = len(lyrics)
    for lyric_idx, line_lyric_dct in enumerate(lyrics):
        time_start = line_lyric_dct["time_start"]
        duration = line_lyric_dct["duration"]
        time_end = time_start + duration
        # text = line_lyric_dct["text"]
        if lyric_idx == 0:
            # everything before the first lyric line is the intro
            sub_mss = {
                "stage": "intro",
                "time_start": 0,
                "duration": time_start,
            }
            mss.append(sub_mss)
            continue
        if lyric_idx == n_lyric - 1:
            # everything after the last lyric line is the ending
            sub_mss = {
                "stage": "end",
                "time_start": time_end,
                "duration": audio_duration - time_end,
            }
            mss.append(sub_mss)
            continue

        # a long silence between consecutive lyric lines is a bridge
        if lyrics[lyric_idx + 1]["time_start"] - time_end >= th:
            sub_mss = {
                "stage": "bridge",
                "time_start": time_end,
                "duration": lyrics[lyric_idx + 1]["time_start"] - time_end,
            }
            mss.append(sub_mss)
    # fill remaining gaps between structural segments with "lyric" spans
    mss_lyric = []
    for sub_idx, sub_mss in enumerate(mss):
        if sub_idx == len(mss) - 1:
            continue
        time_end = sub_mss["time_start"] + sub_mss["duration"]
        next_time_start = mss[sub_idx + 1]["time_start"]
        if next_time_start - time_end > 0.1:
            mss_lyric.append(
                {
                    "stage": "lyric",
                    "time_start": time_end,
                    "duration": next_time_start - time_end,
                }
            )
    mss.extend(mss_lyric)
    mss = sorted(mss, key=lambda x: x["time_start"])
    mss = MusicClipSeq(mss)
    return mss
120
+
121
+
122
def refine_mss_info_from_tianqin(
    mss_info: MusicMap, lyricseq: MusicClipSeq
) -> MusicMap:
    """Refine Tianqin song-structure info into a time-complete sequence.

    Before: Tianqin structure only labels individual lyric lines; segments
    are not contiguous, so the timeline of the whole song is incomplete.
    After: intro/bridge/end segments are added, adjacent segments with the
    same stage are merged, and the result is contiguous in time.

    Args:
        mss_info (MusicMap): Tianqin song structure.
        lyricseq (ClipSeq): original lyric info used to compute intro,
            bridge and end (could also be derived from mss_info).

    Returns:
        MusicMap: music map with the refined stage clip sequence.
    """
    lyric_mss_clipseq = generate_mss_from_lyric(
        lyricseq, audio_duration=mss_info.meta_info.duration
    )
    new_mss_clipseq = []
    # lyric_mss_dct = lyric_mss_clipseq.to_dct()
    # mss_dct = mss_info.clipseq.to_dct()
    for l_clip_idx, lyric_clip in enumerate(lyric_mss_clipseq):
        if lyric_clip.stage != "lyric":
            # intro/bridge/end segments pass through unchanged
            new_mss_clipseq.append(lyric_clip)
        else:
            # walk the Tianqin clips inside this "lyric" span and emit one
            # merged segment per run of identical stages
            new_clip_time_start = lyric_clip.time_start
            last_stage = "ANewClipStart"  # sentinel: no stage seen yet
            for clip_idx, clip in enumerate(mss_info.clipseq):
                if clip.time_start < new_clip_time_start:
                    continue
                if (
                    clip.time_start >= lyric_mss_clipseq[l_clip_idx + 1].time_start
                    or clip_idx == len(mss_info.clipseq) - 1
                ):
                    # reached the end of this lyric span (or the very last
                    # Tianqin clip): close out the current run
                    if clip.time_start >= lyric_mss_clipseq[l_clip_idx + 1].time_start:
                        stage = last_stage
                    # e.g. a song whose final lyric section has a single line
                    if clip_idx == len(mss_info.clipseq) - 1:
                        stage = clip.stage
                    new_clip_time_end = lyric_mss_clipseq[l_clip_idx + 1].time_start
                    new_stage_clip = {
                        "time_start": new_clip_time_start,
                        "duration": new_clip_time_end - new_clip_time_start,
                        "stage": stage,
                    }
                    new_mss_clipseq.append(MusicClip(**new_stage_clip))
                    new_clip_time_start = new_clip_time_end
                    last_stage = clip.stage
                    break
                if clip.stage != last_stage:
                    if last_stage == "ANewClipStart":
                        last_stage = clip.stage
                        continue
                    # stage changed: emit the finished run as one segment
                    new_clip_time_end = mss_info.clipseq[clip_idx].time_start
                    new_stage_clip = {
                        "time_start": new_clip_time_start,
                        "duration": new_clip_time_end - new_clip_time_start,
                        "stage": last_stage,
                    }
                    new_mss_clipseq.append(MusicClip(**new_stage_clip))
                    new_clip_time_start = new_clip_time_end
                    last_stage = clip.stage
    new_mss_clipseq = MusicClipSeq(sorted(new_mss_clipseq, key=lambda x: x.time_start))
    mss_info.clipseq = new_mss_clipseq
    return mss_info
MuseV/MMCM/mmcm/music/music_map/music_clip.py ADDED
@@ -0,0 +1,83 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ from __future__ import annotations
2
+ from typing import Dict, List
3
+
4
+ from ...data.clip import Clip, ClipSeq
5
+
6
+
7
class MusicClip(Clip):
    """Clip specialized for music: adds lyric-text counting helpers."""

    def __init__(self, time_start: float, duration: float, clipid: int = None, media_type: str = None, mediaid: str = None, timepoint_type: str = None, text: str = None, stage: str = None, path: str = None, duration_num: int = None, similar_clipseq: MatchedClipIds = None, dynamic: float = None, **kwargs):
        """Forward all fields positionally to the base Clip."""
        super().__init__(time_start, duration, clipid, media_type, mediaid, timepoint_type, text, stage, path, duration_num, similar_clipseq, dynamic, **kwargs)

    @property
    def text_num(self):
        # number of space-separated tokens in `text`
        return self._cal_text_num()

    @property
    def original_text_num(self):
        # number of space-separated tokens in `original_text`
        return self._cal_text_num(text_mode=1)

    def _cal_text_num(self, text_mode: int = 0) -> int:
        """Count text tokens.

        Args:
            text_mode (int, optional): 0 counts ``text``; anything else
                counts ``original_text``. Defaults to 0.

        Returns:
            int: token count (0 when the text is None).
                NOTE(review): counting splits on spaces, so unspaced
                Chinese text counts as a single token — confirm intended.
        """
        if text_mode == 0:
            text = self.text
        else:
            text = self.original_text
        if text is None:
            n_text = 0
        else:
            text = text.strip().split(" ")
            n_text = len(text)
        return n_text

    @property
    def text_num_per_second(self):
        """Token count of ``text`` per second of clip duration."""
        return self._cal_text_num_per_second(mode=0)

    @property
    def original_text_num_per_second(self):
        """Token count of ``original_text`` per second of clip duration."""
        return self._cal_text_num_per_second(mode=1)

    @property
    def tnps(self):
        """Shorthand for text_num_per_second."""
        return self.text_num_per_second

    @property
    def original_tnps(self):
        """Shorthand for original_text_num_per_second."""
        return self.original_text_num_per_second

    def _cal_text_num_per_second(self, mode=0):
        """Token count divided by clip duration (mode 0: text, else original_text)."""
        text_num = self.text_num if mode == 0 else self.original_text_num
        return text_num / self.duration

    @classmethod
    def from_data(cls, data: Dict):
        """Build a MusicClip from a plain keyword dict."""
        return MusicClip(**data)
67
+
68
+
69
class MusicClipSeq(ClipSeq):
    """Sequence of MusicClip items; ``clipseq`` aliases the inherited data list."""

    def __init__(self, items: List[Clip] = None):
        super().__init__(items)
        # music-flavored alias for the list held by the base class
        self.clipseq = self.data

    @classmethod
    def from_data(cls, clipseq: List[Dict]) -> MusicClipSeq:
        """Build a MusicClipSeq from a list of clip dicts."""
        return MusicClipSeq([MusicClip.from_data(item) for item in clipseq])
83
+
MuseV/MMCM/mmcm/music/music_map/music_map.py ADDED
@@ -0,0 +1,140 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ from __future__ import annotations
2
+ from typing import List, Dict
3
+
4
+ from moviepy.editor import concatenate_audioclips, AudioClip, AudioFileClip
5
+
6
+ from ...data import MediaMap, MediaMapEmb, MetaInfo, MediaMapSeq
7
+ from ...data.clip.clip_process import find_time_by_stage
8
+ from ...data.emb.h5py_emb import H5pyMediaMapEmb
9
+ from ...utils.util import load_dct_from_file
10
+
11
+ from .clip_process import get_stageseq_from_clipseq
12
+ from .music_clip import MusicClip, MusicClipSeq
13
+ from .meta_info import MusicMetaInfo
14
+
15
+
16
class MusicMap(MediaMap):
    """Music media map: clip sequence plus lyric and stage (structure) sequences."""

    def __init__(
        self,
        meta_info: MetaInfo,
        clipseq: MusicClipSeq,
        lyricseq: MusicClipSeq = None,
        stageseq: MusicClipSeq = None,
        frameseq: MusicClipSeq = None,
        emb: MediaMapEmb = None,
        **kwargs,
    ):
        # set before super().__init__ since base init may run preprocess
        self.lyricseq = lyricseq
        super().__init__(meta_info, clipseq, stageseq, frameseq, emb, **kwargs)
        if self.stageseq is None:
            # derive the stage sequence from clip stages when none was given
            self.stageseq = MusicClipSeq.from_data(
                get_stageseq_from_clipseq(self.clipseq)
            )
            self.stageseq.preprocess()

    def preprocess(self):
        """Apply target_stages (when set), base preprocessing, and spread meta info to clips."""
        if (
            hasattr(self.meta_info, "target_stages")
            and self.meta_info.target_stages is not None
        ):
            self.set_start_end_by_target_stages()
        super().preprocess()
        self.spread_metainfo_2_clip(
            target_keys=[
                "media_path",
                "media_map_path",
                "emb_path",
                "media_duration",
                "mediaid",
                "media_name",
                "emb",
            ]
        )

    def set_start_end_by_target_stages(self):
        """Set meta_info.start/end from the first/last target stage boundaries."""
        target_stages = self.meta_info.target_stages
        if not isinstance(target_stages, List):
            target_stages = [target_stages]
        start, _ = find_time_by_stage(self.stageseq, target_stages[0])
        _, end = find_time_by_stage(self.stageseq, target_stages[-1])
        self.meta_info.start = start
        self.meta_info.end = end

    @property
    def audio_clip(self) -> AudioFileClip:
        """Load the audio behind this map, trimmed to [start, end].

        Returns:
            AudioClip: moviepy audio clip.
        """
        audio_clip = AudioFileClip(self.meta_info.media_path)
        audio_clip = audio_clip.subclip(self.meta_info.start, self.meta_info.end)
        return audio_clip

    @classmethod
    def from_json_path(
        cls, path: Dict, emb_path: str, media_path: str = None, **kwargs
    ) -> MusicMap:
        """Load a map dict from file plus its h5py embedding, then build a MusicMap."""
        media_map = load_dct_from_file(path)
        emb = H5pyMediaMapEmb(emb_path)
        return cls.from_data(media_map, emb=emb, media_path=media_path, **kwargs)

    @classmethod
    def from_data(
        cls, data: Dict, emb: H5pyMediaMapEmb, media_path: str = None, **kwargs
    ) -> MusicMap:
        """Build a MusicMap from a map dict; extra keys pass through as kwargs."""
        meta_info = MusicMetaInfo.from_data(data.get("meta_info", {}))
        meta_info.media_path = media_path
        clipseq = MusicClipSeq.from_data(data.get("clipseq", []))
        stageseq = MusicClipSeq.from_data(data.get("stageseq", []))
        lyricseq = MusicClipSeq.from_data(data.get("lyricseq", []))
        # forward every key not consumed above
        target_keys = ["meta_info", "clipseq", "frameseq", "stageseq", "lyricseq"]
        dct = {k: data[k] for k in data.keys() if k not in target_keys}
        dct.update(**kwargs)
        video_map = MusicMap(
            meta_info=meta_info,
            clipseq=clipseq,
            stageseq=stageseq,
            lyricseq=lyricseq,
            emb=emb,
            **dct,
        )
        return video_map

    def to_dct(
        self, target_keys: List[str] = None, ignored_keys: List[str] = None
    ) -> Dict:
        """Serialize the map (meta info and all sequences) to a plain dict."""
        dct = {}
        dct["meta_info"] = self.meta_info.to_dct(
            target_keys=target_keys, ignored_keys=ignored_keys
        )
        dct["clipseq"] = self.clipseq.to_dct(
            target_keys=target_keys, ignored_keys=ignored_keys
        )
        if self.frameseq is not None:
            dct["frameseq"] = self.frameseq.to_dct(
                target_keys=target_keys, ignored_keys=ignored_keys
            )
        else:
            dct["frameseq"] = None
        if self.stageseq is not None:
            dct["stageseq"] = self.stageseq.to_dct(
                target_keys=target_keys, ignored_keys=ignored_keys
            )
        else:
            dct["stageseq"] = None
        dct["lyricseq"] = self.lyricseq.to_dct(
            target_keys=target_keys, ignored_keys=ignored_keys
        )
        return dct
130
+
131
+
132
class MusicMapSeq(MediaMapSeq):
    """Ordered sequence of MusicMap objects."""

    def __init__(self, maps: List[MusicMap]) -> None:
        super().__init__(maps)

    @property
    def audio_clip(self) -> AudioFileClip:
        """Concatenate the audio clips of all contained maps in order.

        Returns:
            AudioFileClip: moviepy audio clip spanning every map.
        """
        # fix: the original read the non-existent attribute `audi_clip`,
        # which raised AttributeError on first access
        audio_clip_lst = [m.audio_clip for m in self.maps]
        audio_clip = concatenate_audioclips(audio_clip_lst)
        return audio_clip
MuseV/MMCM/mmcm/music/music_map/music_map_demp.py ADDED
@@ -0,0 +1,58 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ from moviepy.editor import (
2
+ ColorClip,
3
+ concatenate_videoclips,
4
+ AudioFileClip,
5
+ CompositeVideoClip,
6
+ )
7
+
8
+ from ...vision.video_map.video_lyric import render_lyric2video
9
+ from ...vision.video_map.video_writer import write_videoclip
10
+ from .music_map import MusicMap
11
+
12
+
13
def generate_music_map_videodemo(
    music_map: MusicMap,
    path: str,
    audio_path: str,
    render_lyric: bool = True,
    width: int = 360,
    height: int = 240,
    fps: int = 25,
    n_thread: int = 8,
    colors: list = None,
) -> None:
    """Render a music map as a demo video of solid-color transition clips.

    Each clip of ``music_map`` becomes a full-frame color block (colors are
    cycled clip by clip), the blocks are concatenated, the lyric stored in the
    map is optionally rendered on top, the source audio is attached, and the
    result is written to ``path``.

    Args:
        music_map (MusicMap): music map to visualize.
        path (str): output path of the rendered demo video.
        audio_path (str): path of the audio the music map describes.
        render_lyric (bool, optional): whether to overlay the lyric contained
            in the music map. Defaults to True.
        width (int, optional): output video width. Defaults to 360.
        height (int, optional): output video height. Defaults to 240.
        fps (int, optional): output video frame rate. Defaults to 25.
        n_thread (int, optional): number of video-writer threads. Defaults to 8.
        colors (list, optional): RGB colors cycled across clips. Defaults to
            [[51, 161, 201], [46, 139, 87]].
    """
    # Avoid a shared mutable default argument; materialize the default here.
    if colors is None:
        colors = [[51, 161, 201], [46, 139, 87]]
    audio_clip = AudioFileClip(audio_path)
    size = (width, height)
    # One solid-color block per music clip, matching that clip's duration.
    color_blocks = [
        ColorClip(size=size, color=colors[i % len(colors)], duration=clip.duration)
        for i, clip in enumerate(music_map.clipseq)
    ]
    demo_clip = concatenate_videoclips(color_blocks, method="compose")
    if render_lyric:
        demo_clip = render_lyric2video(
            videoclip=demo_clip,
            lyric=music_map,
            lyric_info_type="music_map",
        )
    demo_clip = demo_clip.set_audio(audio_clip)
    write_videoclip(
        demo_clip,
        path=path,
        fps=fps,
        n_thread=n_thread,
    )
MuseV/MMCM/mmcm/music/utils/__init__.py ADDED
File without changes
MuseV/MMCM/mmcm/music/utils/path_util.py ADDED
@@ -0,0 +1,9 @@
 
 
 
 
 
 
 
 
 
 
1
+ import os
2
+ from typing import Dict, Tuple
3
+
4
+ from ...utils.path_util import get_dir_file_map
5
+
6
+
7
def get_audio_path_dct(path: str, exts=None) -> Dict[str, str]:
    """Walk ``path`` and its subdirectories and map audio file names to paths.

    Args:
        path (str): root directory to scan recursively.
        exts (list, optional): audio file extensions to include.
            Defaults to ["mp3", "flac", "wav"].

    Returns:
        Dict[str, str]: mapping produced by :func:`get_dir_file_map`.
    """
    # None-sentinel avoids the shared mutable default-argument pitfall.
    if exts is None:
        exts = ["mp3", "flac", "wav"]
    return get_dir_file_map(path, exts=exts)
MuseV/MMCM/mmcm/t2p/.gitignore ADDED
@@ -0,0 +1,158 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ # Byte-compiled / optimized / DLL files
2
+ __pycache__/
3
+ *.py[cod]
4
+ *$py.class
5
+
6
+ # C extensions
7
+ *.so
8
+
9
+ # Distribution / packaging
10
+ .Python
11
+ build/
12
+ develop-eggs/
13
+ dist/
14
+ downloads/
15
+ eggs/
16
+ .eggs/
17
+ lib/
18
+ lib64/
19
+ parts/
20
+ sdist/
21
+ var/
22
+ wheels/
23
+ pip-wheel-metadata/
24
+ share/python-wheels/
25
+ *.egg-info/
26
+ .installed.cfg
27
+ *.egg
28
+ MANIFEST
29
+
30
+ # PyInstaller
31
+ # Usually these files are written by a python script from a template
32
+ # before PyInstaller builds the exe, so as to inject date/other infos into it.
33
+ *.manifest
34
+ *.spec
35
+
36
+ # Installer logs
37
+ pip-log.txt
38
+ pip-delete-this-directory.txt
39
+
40
+ # Unit test / coverage reports
41
+ htmlcov/
42
+ .tox/
43
+ .nox/
44
+ .coverage
45
+ .coverage.*
46
+ .cache
47
+ nosetests.xml
48
+ coverage.xml
49
+ *.cover
50
+ *.py,cover
51
+ .hypothesis/
52
+ .pytest_cache/
53
+
54
+ # Translations
55
+ *.mo
56
+ *.pot
57
+
58
+ # Django stuff:
59
+ *.log
60
+ local_settings.py
61
+ db.sqlite3
62
+ db.sqlite3-journal
63
+
64
+ # Flask stuff:
65
+ instance/
66
+ .webassets-cache
67
+
68
+ # Scrapy stuff:
69
+ .scrapy
70
+
71
+ # Sphinx documentation
72
+ docs/_build/
73
+
74
+ # PyBuilder
75
+ target/
76
+
77
+ # Jupyter Notebook
78
+ .ipynb_checkpoints
79
+
80
+ # IPython
81
+ profile_default/
82
+ ipython_config.py
83
+
84
+ # pyenv
85
+ .python-version
86
+
87
+ # pipenv
88
+ # According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control.
89
+ # However, in case of collaboration, if having platform-specific dependencies or dependencies
90
+ # having no cross-platform support, pipenv may install dependencies that don't work, or not
91
+ # install all needed dependencies.
92
+ #Pipfile.lock
93
+
94
+ # PEP 582; used by e.g. github.com/David-OConnor/pyflow
95
+ __pypackages__/
96
+
97
+ # Celery stuff
98
+ celerybeat-schedule
99
+ celerybeat.pid
100
+
101
+ # SageMath parsed files
102
+ *.sage.py
103
+
104
+ # Environments
105
+ .env
106
+ .venv
107
+ env/
108
+ venv/
109
+ ENV/
110
+ env.bak/
111
+ venv.bak/
112
+
113
+ # Spyder project settings
114
+ .spyderproject
115
+ .spyproject
116
+
117
+ # Rope project settings
118
+ .ropeproject
119
+
120
+ # mkdocs documentation
121
+ /site
122
+
123
+ # mypy
124
+ .mypy_cache/
125
+ .dmypy.json
126
+ dmypy.json
127
+
128
+ # Pyre type checker
129
+ .pyre/
130
+
131
+ .vscode
132
+ dataset/dataset_TM_train_cb1_temp.py
133
+ train_gpt_cnn_temp.py
134
+ train_gpt_cnn_mask.py
135
+ start.sh
136
+ start_eval.sh
137
+ config.json
138
+ output_GPT_Final
139
+ output_vqfinal
140
+ output_transformer
141
+ glove
142
+ checkpoints
143
+ dataset/HumanML3D
144
+ dataset/KIT-ML
145
+ output
146
+ matrix_multi.py
147
+ body_models
148
+ render_final_diffuse.py
149
+ render_final_mdm.py
150
+ pretrained
151
+ MDM
152
+ Motiondiffusion
153
+ Visualize_temp.py
154
+ new.sh
155
+ T2M_render
156
+ render_final_t2m.py
157
+
158
+ pose
MuseV/MMCM/mmcm/t2p/GPT_eval_multi.py ADDED
@@ -0,0 +1,121 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
"""Evaluate a pretrained text-to-motion pipeline (VQ-VAE + GPT transformer)
on the evaluation split, repeating the run to report mean metrics and 95%
confidence intervals."""
import os
import torch
import numpy as np
from torch.utils.tensorboard import SummaryWriter
import json
import clip

import options.option_transformer as option_trans
import models.vqvae as vqvae
import utils.utils_model as utils_model
import utils.eval_trans as eval_trans
from dataset import dataset_TM_eval
import models.t2m_trans as trans
from options.get_eval_option import get_opt
from models.evaluator_wrapper import EvaluatorModelWrapper
import warnings
warnings.filterwarnings('ignore')

##### ---- Exp dirs ---- #####
args = option_trans.get_args_parser()
torch.manual_seed(args.seed)

args.out_dir = os.path.join(args.out_dir, f'{args.exp_name}')
os.makedirs(args.out_dir, exist_ok = True)

##### ---- Logger ---- #####
logger = utils_model.get_logger(args.out_dir)
writer = SummaryWriter(args.out_dir)
logger.info(json.dumps(vars(args), indent=4, sort_keys=True))

# Mid-file import kept as-is: moving it above could change side-effect order.
from utils.word_vectorizer import WordVectorizer
w_vectorizer = WordVectorizer('./glove', 'our_vab')
val_loader = dataset_TM_eval.DATALoader(args.dataname, True, 32, w_vectorizer)

# Evaluator options depend on the dataset ('kit' vs HumanML3D/'t2m').
dataset_opt_path = 'checkpoints/kit/Comp_v6_KLD005/opt.txt' if args.dataname == 'kit' else 'checkpoints/t2m/Comp_v6_KLD005/opt.txt'

wrapper_opt = get_opt(dataset_opt_path, torch.device('cuda'))
eval_wrapper = EvaluatorModelWrapper(wrapper_opt)

##### ---- Network ---- #####

## load clip model and datasets
clip_model, clip_preprocess = clip.load("ViT-B/32", device=torch.device('cuda'), jit=False)  # Must set jit=False for training
clip.model.convert_weights(clip_model)  # Actually this line is unnecessary since clip by default already on float16
clip_model.eval()
# Freeze CLIP: it is only used as a fixed text encoder during evaluation.
for p in clip_model.parameters():
    p.requires_grad = False

net = vqvae.HumanVQVAE(args, ## use args to define different parameters in different quantizers
                       args.nb_code,
                       args.code_dim,
                       args.output_emb_width,
                       args.down_t,
                       args.stride_t,
                       args.width,
                       args.depth,
                       args.dilation_growth_rate)


trans_encoder = trans.Text2Motion_Transformer(num_vq=args.nb_code,
                                              embed_dim=args.embed_dim_gpt,
                                              clip_dim=args.clip_dim,
                                              block_size=args.block_size,
                                              num_layers=args.num_layers,
                                              n_head=args.n_head_gpt,
                                              drop_out_rate=args.drop_out_rate,
                                              fc_rate=args.ff_rate)


print ('loading checkpoint from {}'.format(args.resume_pth))
ckpt = torch.load(args.resume_pth, map_location='cpu')
net.load_state_dict(ckpt['net'], strict=True)
net.eval()
net.cuda()

if args.resume_trans is not None:
    print ('loading transformer checkpoint from {}'.format(args.resume_trans))
    ckpt = torch.load(args.resume_trans, map_location='cpu')
    trans_encoder.load_state_dict(ckpt['trans'], strict=True)
# NOTE(review): .train() before evaluation looks suspicious — .eval() is the
# usual choice here (dropout stays active in train mode); confirm intended.
trans_encoder.train()
trans_encoder.cuda()


# Per-repeat metric accumulators; averaged and summarized below.
fid = []
div = []
top1 = []
top2 = []
top3 = []
matching = []
multi = []
repeat_time = 20


for i in range(repeat_time):
    # savenpy only on the first repeat so predictions are dumped once.
    best_fid, best_iter, best_div, best_top1, best_top2, best_top3, best_matching, best_multi, writer, logger = eval_trans.evaluation_transformer_test(args.out_dir, val_loader, net, trans_encoder, logger, writer, 0, best_fid=1000, best_iter=0, best_div=100, best_top1=0, best_top2=0, best_top3=0, best_matching=100, best_multi=0, clip_model=clip_model, eval_wrapper=eval_wrapper, draw=False, savegif=False, save=False, savenpy=(i==0))
    fid.append(best_fid)
    div.append(best_div)
    top1.append(best_top1)
    top2.append(best_top2)
    top3.append(best_top3)
    matching.append(best_matching)
    multi.append(best_multi)

print('final result:')
print('fid: ', sum(fid)/repeat_time)
print('div: ', sum(div)/repeat_time)
print('top1: ', sum(top1)/repeat_time)
print('top2: ', sum(top2)/repeat_time)
print('top3: ', sum(top3)/repeat_time)
print('matching: ', sum(matching)/repeat_time)
print('multi: ', sum(multi)/repeat_time)

# 1.96 * std / sqrt(n) is the half-width of the 95% confidence interval.
fid = np.array(fid)
div = np.array(div)
top1 = np.array(top1)
top2 = np.array(top2)
top3 = np.array(top3)
matching = np.array(matching)
multi = np.array(multi)
msg_final = f"FID. {np.mean(fid):.3f}, conf. {np.std(fid)*1.96/np.sqrt(repeat_time):.3f}, Diversity. {np.mean(div):.3f}, conf. {np.std(div)*1.96/np.sqrt(repeat_time):.3f}, TOP1. {np.mean(top1):.3f}, conf. {np.std(top1)*1.96/np.sqrt(repeat_time):.3f}, TOP2. {np.mean(top2):.3f}, conf. {np.std(top2)*1.96/np.sqrt(repeat_time):.3f}, TOP3. {np.mean(top3):.3f}, conf. {np.std(top3)*1.96/np.sqrt(repeat_time):.3f}, Matching. {np.mean(matching):.3f}, conf. {np.std(matching)*1.96/np.sqrt(repeat_time):.3f}, Multi. {np.mean(multi):.3f}, conf. {np.std(multi)*1.96/np.sqrt(repeat_time):.3f}"
logger.info(msg_final)
MuseV/MMCM/mmcm/t2p/LICENSE ADDED
@@ -0,0 +1,201 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ Apache License
2
+ Version 2.0, January 2004
3
+ http://www.apache.org/licenses/
4
+
5
+ TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
6
+
7
+ 1. Definitions.
8
+
9
+ "License" shall mean the terms and conditions for use, reproduction,
10
+ and distribution as defined by Sections 1 through 9 of this document.
11
+
12
+ "Licensor" shall mean the copyright owner or entity authorized by
13
+ the copyright owner that is granting the License.
14
+
15
+ "Legal Entity" shall mean the union of the acting entity and all
16
+ other entities that control, are controlled by, or are under common
17
+ control with that entity. For the purposes of this definition,
18
+ "control" means (i) the power, direct or indirect, to cause the
19
+ direction or management of such entity, whether by contract or
20
+ otherwise, or (ii) ownership of fifty percent (50%) or more of the
21
+ outstanding shares, or (iii) beneficial ownership of such entity.
22
+
23
+ "You" (or "Your") shall mean an individual or Legal Entity
24
+ exercising permissions granted by this License.
25
+
26
+ "Source" form shall mean the preferred form for making modifications,
27
+ including but not limited to software source code, documentation
28
+ source, and configuration files.
29
+
30
+ "Object" form shall mean any form resulting from mechanical
31
+ transformation or translation of a Source form, including but
32
+ not limited to compiled object code, generated documentation,
33
+ and conversions to other media types.
34
+
35
+ "Work" shall mean the work of authorship, whether in Source or
36
+ Object form, made available under the License, as indicated by a
37
+ copyright notice that is included in or attached to the work
38
+ (an example is provided in the Appendix below).
39
+
40
+ "Derivative Works" shall mean any work, whether in Source or Object
41
+ form, that is based on (or derived from) the Work and for which the
42
+ editorial revisions, annotations, elaborations, or other modifications
43
+ represent, as a whole, an original work of authorship. For the purposes
44
+ of this License, Derivative Works shall not include works that remain
45
+ separable from, or merely link (or bind by name) to the interfaces of,
46
+ the Work and Derivative Works thereof.
47
+
48
+ "Contribution" shall mean any work of authorship, including
49
+ the original version of the Work and any modifications or additions
50
+ to that Work or Derivative Works thereof, that is intentionally
51
+ submitted to Licensor for inclusion in the Work by the copyright owner
52
+ or by an individual or Legal Entity authorized to submit on behalf of
53
+ the copyright owner. For the purposes of this definition, "submitted"
54
+ means any form of electronic, verbal, or written communication sent
55
+ to the Licensor or its representatives, including but not limited to
56
+ communication on electronic mailing lists, source code control systems,
57
+ and issue tracking systems that are managed by, or on behalf of, the
58
+ Licensor for the purpose of discussing and improving the Work, but
59
+ excluding communication that is conspicuously marked or otherwise
60
+ designated in writing by the copyright owner as "Not a Contribution."
61
+
62
+ "Contributor" shall mean Licensor and any individual or Legal Entity
63
+ on behalf of whom a Contribution has been received by Licensor and
64
+ subsequently incorporated within the Work.
65
+
66
+ 2. Grant of Copyright License. Subject to the terms and conditions of
67
+ this License, each Contributor hereby grants to You a perpetual,
68
+ worldwide, non-exclusive, no-charge, royalty-free, irrevocable
69
+ copyright license to reproduce, prepare Derivative Works of,
70
+ publicly display, publicly perform, sublicense, and distribute the
71
+ Work and such Derivative Works in Source or Object form.
72
+
73
+ 3. Grant of Patent License. Subject to the terms and conditions of
74
+ this License, each Contributor hereby grants to You a perpetual,
75
+ worldwide, non-exclusive, no-charge, royalty-free, irrevocable
76
+ (except as stated in this section) patent license to make, have made,
77
+ use, offer to sell, sell, import, and otherwise transfer the Work,
78
+ where such license applies only to those patent claims licensable
79
+ by such Contributor that are necessarily infringed by their
80
+ Contribution(s) alone or by combination of their Contribution(s)
81
+ with the Work to which such Contribution(s) was submitted. If You
82
+ institute patent litigation against any entity (including a
83
+ cross-claim or counterclaim in a lawsuit) alleging that the Work
84
+ or a Contribution incorporated within the Work constitutes direct
85
+ or contributory patent infringement, then any patent licenses
86
+ granted to You under this License for that Work shall terminate
87
+ as of the date such litigation is filed.
88
+
89
+ 4. Redistribution. You may reproduce and distribute copies of the
90
+ Work or Derivative Works thereof in any medium, with or without
91
+ modifications, and in Source or Object form, provided that You
92
+ meet the following conditions:
93
+
94
+ (a) You must give any other recipients of the Work or
95
+ Derivative Works a copy of this License; and
96
+
97
+ (b) You must cause any modified files to carry prominent notices
98
+ stating that You changed the files; and
99
+
100
+ (c) You must retain, in the Source form of any Derivative Works
101
+ that You distribute, all copyright, patent, trademark, and
102
+ attribution notices from the Source form of the Work,
103
+ excluding those notices that do not pertain to any part of
104
+ the Derivative Works; and
105
+
106
+ (d) If the Work includes a "NOTICE" text file as part of its
107
+ distribution, then any Derivative Works that You distribute must
108
+ include a readable copy of the attribution notices contained
109
+ within such NOTICE file, excluding those notices that do not
110
+ pertain to any part of the Derivative Works, in at least one
111
+ of the following places: within a NOTICE text file distributed
112
+ as part of the Derivative Works; within the Source form or
113
+ documentation, if provided along with the Derivative Works; or,
114
+ within a display generated by the Derivative Works, if and
115
+ wherever such third-party notices normally appear. The contents
116
+ of the NOTICE file are for informational purposes only and
117
+ do not modify the License. You may add Your own attribution
118
+ notices within Derivative Works that You distribute, alongside
119
+ or as an addendum to the NOTICE text from the Work, provided
120
+ that such additional attribution notices cannot be construed
121
+ as modifying the License.
122
+
123
+ You may add Your own copyright statement to Your modifications and
124
+ may provide additional or different license terms and conditions
125
+ for use, reproduction, or distribution of Your modifications, or
126
+ for any such Derivative Works as a whole, provided Your use,
127
+ reproduction, and distribution of the Work otherwise complies with
128
+ the conditions stated in this License.
129
+
130
+ 5. Submission of Contributions. Unless You explicitly state otherwise,
131
+ any Contribution intentionally submitted for inclusion in the Work
132
+ by You to the Licensor shall be under the terms and conditions of
133
+ this License, without any additional terms or conditions.
134
+ Notwithstanding the above, nothing herein shall supersede or modify
135
+ the terms of any separate license agreement you may have executed
136
+ with Licensor regarding such Contributions.
137
+
138
+ 6. Trademarks. This License does not grant permission to use the trade
139
+ names, trademarks, service marks, or product names of the Licensor,
140
+ except as required for reasonable and customary use in describing the
141
+ origin of the Work and reproducing the content of the NOTICE file.
142
+
143
+ 7. Disclaimer of Warranty. Unless required by applicable law or
144
+ agreed to in writing, Licensor provides the Work (and each
145
+ Contributor provides its Contributions) on an "AS IS" BASIS,
146
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
147
+ implied, including, without limitation, any warranties or conditions
148
+ of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
149
+ PARTICULAR PURPOSE. You are solely responsible for determining the
150
+ appropriateness of using or redistributing the Work and assume any
151
+ risks associated with Your exercise of permissions under this License.
152
+
153
+ 8. Limitation of Liability. In no event and under no legal theory,
154
+ whether in tort (including negligence), contract, or otherwise,
155
+ unless required by applicable law (such as deliberate and grossly
156
+ negligent acts) or agreed to in writing, shall any Contributor be
157
+ liable to You for damages, including any direct, indirect, special,
158
+ incidental, or consequential damages of any character arising as a
159
+ result of this License or out of the use or inability to use the
160
+ Work (including but not limited to damages for loss of goodwill,
161
+ work stoppage, computer failure or malfunction, or any and all
162
+ other commercial damages or losses), even if such Contributor
163
+ has been advised of the possibility of such damages.
164
+
165
+ 9. Accepting Warranty or Additional Liability. While redistributing
166
+ the Work or Derivative Works thereof, You may choose to offer,
167
+ and charge a fee for, acceptance of support, warranty, indemnity,
168
+ or other liability obligations and/or rights consistent with this
169
+ License. However, in accepting such obligations, You may act only
170
+ on Your own behalf and on Your sole responsibility, not on behalf
171
+ of any other Contributor, and only if You agree to indemnify,
172
+ defend, and hold each Contributor harmless for any liability
173
+ incurred by, or claims asserted against, such Contributor by reason
174
+ of your accepting any such warranty or additional liability.
175
+
176
+ END OF TERMS AND CONDITIONS
177
+
178
+ APPENDIX: How to apply the Apache License to your work.
179
+
180
+ To apply the Apache License to your work, attach the following
181
+ boilerplate notice, with the fields enclosed by brackets "[]"
182
+ replaced with your own identifying information. (Don't include
183
+ the brackets!) The text should be enclosed in the appropriate
184
+ comment syntax for the file format. We also recommend that a
185
+ file or class name and description of purpose be included on the
186
+ same "printed page" as the copyright notice for easier
187
+ identification within third-party archives.
188
+
189
+ Copyright 2023 tencent
190
+
191
+ Licensed under the Apache License, Version 2.0 (the "License");
192
+ you may not use this file except in compliance with the License.
193
+ You may obtain a copy of the License at
194
+
195
+ http://www.apache.org/licenses/LICENSE-2.0
196
+
197
+ Unless required by applicable law or agreed to in writing, software
198
+ distributed under the License is distributed on an "AS IS" BASIS,
199
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
200
+ See the License for the specific language governing permissions and
201
+ limitations under the License.