anchorxia committed on Apr 11, 2024

Commit

a57c6eb

1 Parent(s): f7d3f4d

add mmcm

This view is limited to 50 files because it contains too many changes. See raw diff

Files changed (50)

MuseV/MMCM/.gitignore +139 -0
MuseV/MMCM/Dockerfile +83 -0
MuseV/MMCM/README.md +2 -0
MuseV/MMCM/mmcm/__init__.py +6 -0
MuseV/MMCM/mmcm/audio/__init__.py +0 -0
MuseV/MMCM/mmcm/data/__init__.py +9 -0
MuseV/MMCM/mmcm/data/clip.py +324 -0
MuseV/MMCM/mmcm/data/clip/__init__.py +5 -0
MuseV/MMCM/mmcm/data/clip/clip.py +197 -0
MuseV/MMCM/mmcm/data/clip/clip_filter.py +46 -0
MuseV/MMCM/mmcm/data/clip/clip_fusion.py +64 -0
MuseV/MMCM/mmcm/data/clip/clip_process.py +366 -0
MuseV/MMCM/mmcm/data/clip/clip_stat.py +13 -0
MuseV/MMCM/mmcm/data/clip/clipid.py +70 -0
MuseV/MMCM/mmcm/data/crawl/__init__.py +0 -0
MuseV/MMCM/mmcm/data/crawl/download.py +72 -0
MuseV/MMCM/mmcm/data/crawl/error.py +20 -0
MuseV/MMCM/mmcm/data/crawl/ffmpeg.py +39 -0
MuseV/MMCM/mmcm/data/crawl/flicker.py +22 -0
MuseV/MMCM/mmcm/data/crawl/youtube.py +13 -0
MuseV/MMCM/mmcm/data/emb/__init__.py +2 -0
MuseV/MMCM/mmcm/data/emb/emb.py +104 -0
MuseV/MMCM/mmcm/data/emb/h5py_emb.py +119 -0
MuseV/MMCM/mmcm/data/emb/json_emb.py +0 -0
MuseV/MMCM/mmcm/data/emb/numpy_emb.py +0 -0
MuseV/MMCM/mmcm/data/extract_feature/__init__.py +0 -0
MuseV/MMCM/mmcm/data/extract_feature/base_extract_feature.py +28 -0
MuseV/MMCM/mmcm/data/general/__init__.py +1 -0
MuseV/MMCM/mmcm/data/general/items.py +69 -0
MuseV/MMCM/mmcm/data/media_map/__init__.py +1 -0
MuseV/MMCM/mmcm/data/media_map/media_map.py +393 -0
MuseV/MMCM/mmcm/data/media_map/media_map_process.py +72 -0
MuseV/MMCM/mmcm/music/__init__.py +6 -0
MuseV/MMCM/mmcm/music/music_map/__init__.py +0 -0
MuseV/MMCM/mmcm/music/music_map/beat_map.py +82 -0
MuseV/MMCM/mmcm/music/music_map/clip_process.py +196 -0
MuseV/MMCM/mmcm/music/music_map/convert_type.py +57 -0
MuseV/MMCM/mmcm/music/music_map/load_music_map.py +38 -0
MuseV/MMCM/mmcm/music/music_map/lyric_map.py +149 -0
MuseV/MMCM/mmcm/music/music_map/lyric_process.py +515 -0
MuseV/MMCM/mmcm/music/music_map/meta_info.py +21 -0
MuseV/MMCM/mmcm/music/music_map/mss_map.py +185 -0
MuseV/MMCM/mmcm/music/music_map/music_clip.py +83 -0
MuseV/MMCM/mmcm/music/music_map/music_map.py +140 -0
MuseV/MMCM/mmcm/music/music_map/music_map_demp.py +58 -0
MuseV/MMCM/mmcm/music/utils/__init__.py +0 -0
MuseV/MMCM/mmcm/music/utils/path_util.py +9 -0
MuseV/MMCM/mmcm/t2p/.gitignore +158 -0
MuseV/MMCM/mmcm/t2p/GPT_eval_multi.py +121 -0
MuseV/MMCM/mmcm/t2p/LICENSE +201 -0

MuseV/MMCM/.gitignore ADDED

	@@ -0,0 +1,139 @@

+# Byte-compiled / optimized / DLL files
+__pycache__/
+*.py[cod]
+*$py.class
+# C extensions
+*.so
+# Distribution / packaging
+.Python
+build/
+develop-eggs/
+dist/
+downloads/
+eggs/
+.eggs/
+lib/
+lib64/
+parts/
+sdist/
+var/
+wheels/
+pip-wheel-metadata/
+share/python-wheels/
+*.egg-info/
+.installed.cfg
+*.egg
+MANIFEST
+# PyInstaller
+#  Usually these files are written by a python script from a template
+#  before PyInstaller builds the exe, so as to inject date/other infos into it.
+*.manifest
+*.spec
+# Installer logs
+pip-log.txt
+pip-delete-this-directory.txt
+# Unit test / coverage reports
+htmlcov/
+.tox/
+.nox/
+.coverage
+.coverage.*
+.cache
+nosetests.xml
+coverage.xml
+*.cover
+.hypothesis/
+.pytest_cache/
+# Translations
+*.mo
+*.pot
+# Django stuff:
+*.log
+local_settings.py
+db.sqlite3
+# Flask stuff:
+instance/
+.webassets-cache
+# Scrapy stuff:
+.scrapy
+# Sphinx documentation
+docs/_build/
+# PyBuilder
+target/
+# Jupyter Notebook
+.ipynb_checkpoints
+# IPython
+profile_default/
+ipython_config.py
+# pyenv
+.python-version
+# pipenv
+#   According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control.
+#   However, in case of collaboration, if having platform-specific dependencies or dependencies
+#   having no cross-platform support, pipenv may install dependencies that don’t work, or not
+#   install all needed dependencies.
+#Pipfile.lock
+# celery beat schedule file
+celerybeat-schedule
+# SageMath parsed files
+*.sage.py
+# Environments
+.env
+.venv
+env/
+venv/
+ENV/
+env.bak/
+venv.bak/
+# Spyder project settings
+.spyderproject
+.spyproject
+# Rope project settings
+.ropeproject
+# mkdocs documentation
+/site
+# mypy
+.mypy_cache/
+.dmypy.json
+dmypy.json
+# Pyre type checker
+.pyre/
+*.swp
+.*.swp
+dataset/files
+experiments
+log
+csvs
+.idea
+.vscode
+__pycache__/
+*.code-workspace
+.DS_Store
+third_party/
+.polaris_cache/
+*.lock

MuseV/MMCM/Dockerfile ADDED

	@@ -0,0 +1,83 @@

+# FROM mirrors.tencent.com/todacc/venus-std-base-cuda11.8:0.1.0
+FROM mirrors.tencent.com/todacc/venus-std-ext-cuda11.8-pytorch2.0-tf2.12-py3.10:0.7.0
+# MAINTAINER: maintainer information
+LABEL MAINTAINER="anchorxia"
+LABEL Email="[email protected]"
+LABEL Description="gpu development image, from mirrors.tencent.com/todacc/venus-std-ext-cuda11.8-pytorch2.0-tf2.12-py3.10:0.7.0"
+USER root
+# Install required software
+# RUN GENERIC_REPO_URL="http://mirrors.tencent.com/repository/generic/venus_repo/image_res" \
+#     && cd /data/ \
+#     && wget -q $GENERIC_REPO_URL/gcc/gcc-11.2.0.zip \
+#     && unzip -q gcc-11.2.0.zip  \
+#     && cd gcc-releases-gcc-11.2.0 \
+#     && ./contrib/download_prerequisites \
+#     && ./configure --enable-bootstrap --enable-languages=c,c++ --enable-threads=posix --enable-checking=release --enable-multilib --with-system-zlib \
+#     && make --silent -j10 \
+#     && make --silent install \
+#     && gcc -v \
+#     && rm -rf /data/gcc-releases-gcc-11.2.0 /data/gcc-11.2.0.zip
+# RUN yum update -y \
+#     && yum install -y epel-release \
+#     && yum install -y ffmpeg \
+#     && yum install -y Xvfb \
+#     && yum install -y centos-release-scl devtoolset-11
+RUN yum install -y wget zsh git curl tmux cmake htop iotop git-lfs zip \
+    && yum install -y autojump autojump-zsh portaudio portaudio-devel \
+    && yum clean all
+USER mqq
+RUN source ~/.bashrc \
+    && GENERIC_REPO_URL="http://mirrors.tencent.com/repository/generic/venus_repo/image_res" \
+    && conda deactivate \
+    # && conda remove -y -n env-2.7.18 --all \
+    # && conda remove -y -n env-3.6.8 --all \
+    # && conda remove -y -n env-3.7.7 --all \
+    # && conda remove -y -n env-3.8.8 --all \
+    # && conda remove -y -n env-3.9.2 --all \
+    # && conda remove -y -n env-novelai --all \
+    && conda create -n projectv python=3.10.6 -y \
+    && conda activate projectv \
+    && pip install venus-sdk -q -i https://mirrors.tencent.com/repository/pypi/tencent_pypi/simple \
+    --extra-index-url https://mirrors.tencent.com/pypi/simple/ \
+    && pip install tensorflow==2.12.0 tensorboard==2.12.0 \
+    && pip install torch==2.0.1+cu118 torchvision==0.15.2+cu118 -f https://mirror.sjtu.edu.cn/pytorch-wheels/torch_stable.html -i https://mirrors.bfsu.edu.cn/pypi/web/simple -U \
+    # Install xformers, with support for different GPU models
+    && pip install ninja==1.11.1 \
+    # && git clone https://github.com/facebookresearch/xformers.git \
+    # && cd xformers \
+    # && git checkout v0.0.17rc482 \
+    # && git submodule update --init --recursive \
+    # && pip install numpy==1.23.4 pyre-extensions==0.0.23 \
+    # && FORCE_CUDA="1" MAX_JOBS=1 TORCH_CUDA_ARCH_LIST="6.1;7.0;7.5;8.0;8.6" pip install -e . \
+    # && cd .. \
+    # Install a batch of packages
+    && pip install --no-cache-dir transformers bitsandbytes decord accelerate xformers omegaconf einops imageio==2.31.1 \
+    && pip install --no-cache-dir pandas h5py matplotlib modelcards pynvml black pytest moviepy torch-tb-profiler scikit-learn librosa ffmpeg easydict webp controlnet_aux mediapipe \
+    && pip install --no-cache-dir Cython easydict gdown infomap insightface ipython librosa onnx onnxruntime onnxsim opencv_python Pillow protobuf pytube PyYAML \
+    && pip install --no-cache-dir requests scipy six tqdm gradio albumentations opencv-contrib-python imageio-ffmpeg pytorch-lightning test-tube \
+    && pip install --no-cache-dir timm addict yapf prettytable safetensors basicsr fvcore pycocotools wandb gunicorn \
+    && pip install --no-cache-dir streamlit webdataset kornia open_clip_torch streamlit-drawable-canvas torchmetrics \
+    # Install invisible watermark
+    && pip install --no-cache-dir invisible-watermark==0.1.5 gdown==4.5.3 ftfy==6.1.1 modelcards==0.1.6 \
+    # Install OpenMMLab-related packages
+    && pip install --no-cache-dir -U openmim \
+    && mim install mmengine \
+    && mim install "mmcv>=2.0.1" \
+    && mim install "mmdet>=3.1.0" \
+    && mim install "mmpose>=1.1.0" \
+    # jupyters
+    && pip install ipywidgets==8.0.3 \
+    && python -m ipykernel install --user --name projectv --display-name "python(projectv)" \
+    && pip install --no-cache-dir matplotlib==3.6.2 redis==4.5.1  pydantic[dotenv]==1.10.2 loguru==0.6.0 IProgress==0.4 \
+    && pip install --no-cache-dir  cos-python-sdk-v5==1.9.22 coscmd==1.8.6.30 \
+    # Must be the last pip install, to avoid incompatibility with Jupyter
+    && pip install --no-cache-dir  markupsafe==2.0.1 \
+    && wget -P /tmp $GENERIC_REPO_URL/cpu/clean-layer.sh \
+    && sh /tmp/clean-layer.sh
+ENV LD_LIBRARY_PATH=/usr/local/lib64:$LD_LIBRARY_PATH
+USER root

MuseV/MMCM/README.md ADDED

	@@ -0,0 +1,2 @@


+# MMCM
+Processing package for multimedia and cross-modal data.

MuseV/MMCM/mmcm/__init__.py ADDED

	@@ -0,0 +1,6 @@

+from .audio import *
+from .data import *
+from .music import *
+from .text import *
+from .vision import *
+from .t2p import *

MuseV/MMCM/mmcm/audio/__init__.py ADDED

File without changes

MuseV/MMCM/mmcm/data/__init__.py ADDED

	@@ -0,0 +1,9 @@

+from .general.items import Items, Item
+from .emb.emb import MediaMapEmb
+from .emb.h5py_emb import H5pyMediaMapEmb, H5pyMediaMapEmbProxy
+from .media_map.media_map import MediaMap, MetaInfo, MetaInfoList, MediaMapSeq
+from .media_map.media_map_process import get_sub_mediamap_by_clip_idx, get_sub_mediamap_by_stage, get_subseq_by_time
+from .clip.clip import Clip, ClipSeq
+from .clip.clipid import ClipIds, ClipIdsSeq, MatchedClipIds, MatchedClipIdsSeq

MuseV/MMCM/mmcm/data/clip.py ADDED

	@@ -0,0 +1,324 @@

+from copy import deepcopy
+from typing import Iterable
+import logging
+import numpy as np
+from ..utils.util import convert_class_attr_to_dict
+from .general.items import Item
+logger = logging.getLogger(__name__)  # pylint: disable=invalid-name
+class Clip(Item):
+    """A media clip: the segment between two transition points."""
+    def __init__(
+        self,
+        time_start,
+        duration,
+        clipid=None,
+        media_type=None,
+        mediaid=None,
+        timepoint_type=None,
+        text=None,
+        stage=None,
+        path=None,
+        duration_num=None,
+        group_time_start=0,
+        group_clipid=None,
+        original_clipid=None,
+        emb=None,
+        multi_factor=None,
+        similar_clipseq=None,
+        rythm: float = None,
+        **kwargs
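+    ):
+        """
+        Args:
+            time_start (float): Start time in seconds, relative to the media file; corresponds one-to-one with the index in media_map.json.
+            duration (float): Duration of the clip.
+            clipid (int or [int]): Clip index provided by media_map; corresponds one-to-one with the index in media_map.json.
+            media_type (str, optional): One of music, video, text. Defaults to None.
+            mediaid (int): Media id. When clipid is a list, the clip is a fused clip.
+            timepoint_type (int, optional): Transition type of the start point. Defaults to None.
+            text (str, optional): Text description of the clip: lyrics for music, dialogue for video, or even bullet comments. Defaults to None.
+            stage (str, optional): Structural position of the clip within the media file, e.g. intro, chorus, or verse for music; opening, ending, beginning, climax, or transition for video. Defaults to None.
+            path (str, optional): Path of the media file, used for later reading and processing. Defaults to None.
+            duration_num (int, optional): Duration of the clip in frames. Defaults to None.
+            group_time_start (int, optional): When editing multiple songs or videos, the total duration of all sub-media that precede the sub-media this clip belongs to.
+                The default 0 means there is only one media file. Defaults to 0.
+            group_clipid (int, optional): Actual index in MediaInfo.sub_meta_info.
+            original_clipid (None or [int], optional): Some clips are merged from other clips; this field records the source clips, with ids being the actual indices in media_map.json. Defaults to None.
+            emb (np.array, optional): Combined embedding of the clip. Defaults to None.
+            multi_factor (MultiFactorFeature, optional): Multi-dimensional features. Defaults to None.
+            similar_clipseq ([Clip], optional): Clips similar to this one; the exact structure is yet to be defined. Defaults to None.
+        """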
+        self.media_type = media_type
+        self.mediaid = mediaid
+        self.time_start = time_start
+        self.duration = duration
+        self.clipid = clipid
+        self.path = path
+        self.timepoint_type = timepoint_type
+        self.text = text
+        self.stage = stage
+        self.group_time_start = group_time_start
+        self.group_clipid = group_clipid
+        self.duration_num = duration_num
+        self.original_clipid = original_clipid if original_clipid is not None else []
+        self.emb = emb
+        self.multi_factor = multi_factor
+        self.similar_clipseq = similar_clipseq
+        self.rythm = rythm
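+        # TODO: the map currently carries some unnecessary intermediate results that use a lot of memory; they are dropped in code for now, pending a finalized data protocol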
+        kwargs = {k: v for k, v in kwargs.items()}
+        self.__dict__.update(kwargs)
+        self.preprocess()
+    def preprocess(self):
+        pass
+    def spread_parameters(self):
+        pass
+    @property
+    def time_end(
+        self,
+    ):
+        return self.time_start + self.duration
+    @property
+    def mvp_clip(self):
+        """Load the actual clip data in moviepy format.
+        Raises:
+            NotImplementedError: always, since loading is not implemented yet
+        """
+        raise NotImplementedError
+class ClipSeq(object):
+    """Sequence of media clips."""
+    ClipClass = Clip
+    def __init__(self, clips) -> None:
+        """Build a sequence from Clip objects or from dicts of Clip arguments.
+        Args:
+            clips ([Clip]): list of media clips
+        """
+        if not isinstance(clips, list):
+            clips = [clips]
+        if len(clips) == 0:
+            self.clips = []
+        elif isinstance(clips[0], dict):
+            self.clips = [self.ClipClass(**d) for d in clips]
+        else:
+            self.clips = clips
+    def set_clip_value(self, k, v):
+        """Set attribute k to value v on every clip in the sequence."""
+        for i in range(len(self.clips)):
+            self.clips[i].__setattr__(k, v)
+    def __len__(
+        self,
+    ):
+        return len(self.clips)
+    def merge(self, other, group_time_start_delta=None, groupid_delta=None):
+        """Merge another ClipSeq into this one. When merging media_info, each clip's groupid and group_time_start must be tracked; the deltas express the change.
+        Args:
+            other (ClipSeq): the ClipSeq to merge in
+            group_time_start_delta (float, optional): offset added to each merged clip's group_time_start. Defaults to None.
+            groupid_delta (int, optional): offset added to each merged clip's groupid. Defaults to None.
+        """
+        if group_time_start_delta is not None or groupid_delta is not None:
+            for i, clip in enumerate(other):
+                if group_time_start_delta is not None:
+                    clip.group_time_start += group_time_start_delta
+                if groupid_delta is not None:
+                    clip.groupid += groupid_delta
+        self.clips.extend(other.clips)
+        for i in range(len(self.clips)):
+            self.clips[i].group_clipid = i
+    @property
+    def duration(
+        self,
+    ):
+        """Sum of Clip.duration over the sequence.
+        Returns:
+            float: total duration of the sequence
+        """
+        if len(self.clips) == 0:
+            return 0
+        else:
+            return sum([c.duration for c in self.clips])
+    def __getitem__(self, i) -> Clip:
+        """Support indexing and slicing: an integer returns a Clip, while a slice or an iterable of indices returns a ClipSeq.
+        Args:
+            i (int, slice, or Iterable[int]): index
+        Raises:
+            ValueError: if the index type is not supported
+        Returns:
+            Clip or ClipSeq:
+        """
+        if "int" in str(type(i)):
+            i = int(i)
+        if isinstance(i, int):
+            clip = self.clips[i]
+            return clip
+        elif isinstance(i, Iterable):
+            clips = [self.__getitem__(x) for x in i]
+            clipseq = ClipSeq(clips)
+            return clipseq
+        elif isinstance(i, slice):
+            # slice.indices resolves omitted or negative start/stop/step against the sequence length
+            start, stop, step = i.indices(len(self.clips))
+            clips = [self.__getitem__(x) for x in range(start, stop, step)]
+            clipseq = ClipSeq(clips)
+            return clipseq
+        else:
+            raise ValueError(
+                "unsupported input, should be int or slice, but given {}, type={}".format(
+                    i, type(i)
+                )
+            )
+    def insert(self, idx, obj):
+        self.clips.insert(idx, obj)
+    def append(self, obj):
+        self.clips.append(obj)
+    def extend(self, objs):
+        self.clips.extend(objs)
+    @property
+    def duration_seq_emb(
+        self,
+    ):
+        emb = np.array([c.duration for c in self.clips])
+        return emb
+    @property
+    def timestamp_seq_emb(self):
+        emb = np.array([c.time_start for c in self.clips])
+        return emb
+    @property
+    def rela_timestamp_seq_emb(self):
+        emb = self.timestamp_seq_emb / self.duration
+        return emb
+    def get_factor_seq_emb(self, factor, dim):
+        emb = []
+        for c in self.clips:
+            if factor not in c.multi_factor or c.multi_factor[factor] is None:
+                v = np.full(dim, np.inf)
+            else:
+                v = c.multi_factor[factor]
+            emb.append(v)
+        emb = np.stack(emb, axis=0)
+        return emb
+    def semantic_seq_emb(self, dim):
+        return self.get_factor_seq_emb(factor="semantics", dim=dim)
+    def emotion_seq_emb(self, dim):
+        return self.get_factor_seq_emb(factor="emotion", dim=dim)
+    def theme_seq_emb(self, dim):
+        return self.get_factor_seq_emb(factor="theme", dim=dim)
+    def to_dct(
+        self,
+        target_keys=None,
+        ignored_keys=None,
+    ):
+        if ignored_keys is None:
+            ignored_keys = ["kwargs", "audio_path", "lyric_path", "start", "end"]
+        clips = [
+            clip.to_dct(target_keys=target_keys, ignored_keys=ignored_keys)
+            for clip in self.clips
+        ]
+        return clips
+    @property
+    def mvp_clip(self):
+        """Load the actual clip data in moviepy format.
+        Raises:
+            NotImplementedError: always, since loading is not implemented yet
+        """
+        raise NotImplementedError
+class ClipIds(object):
+    def __init__(
+        self,
+        clipids: list or int,
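+    ) -> None:
+        """Indices of Clips within a ClipSeq, mainly used for a Clip fused from several Clips. Example use case:
+        1. One MusicClip can match several VideoClips; the VideoClip indices can then be expressed with ClipIds.
+        Args:
+            clipids (list or int): indices within the ClipSeq
+        """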
+        self.clipids = clipids if isinstance(clipids, list) else [clipids]
+class ClipIdsSeq(object):
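+    def __init__(self, clipids_seq: list) -> None:
+        """Several ClipIds. Example use case:
+        1. Regrouping a MediaClipSeq by splitting and recombining it into a coarser-grained ClipSeq.
+        Args:
+            clipids_seq (list): list of combined ClipIds
+        """
+        # Wrap a single ClipIds in a list; keep an existing list as-is
+        self.clipids_seq = (
+            clipids_seq if isinstance(clipids_seq, list) else [clipids_seq]
+        )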
+# TODO: metric may become a dict later
+class MatchedClipIds(object):
+    def __init__(
+        self, id1: ClipIds, id2: ClipIds, metric: float = None, **kwargs
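+    ) -> None:
+        """A matched pair of clips from two modalities. Example use case:
+        1. The matching relation between a music clip and a video clip.
+        Args:
+            id1 (ClipIds): clip of the first modality
+            id2 (ClipIds): clip of the second modality
+            metric (float): matching distance
+        """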
+        self.id1 = id1 if isinstance(id1, ClipIds) else ClipIds(id1)
+        self.id2 = id2 if isinstance(id2, ClipIds) else ClipIds(id2)
+        self.metric = metric
+        self.__dict__.update(**kwargs)
+class MatchedClipIdsSeq(object):
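+    def __init__(self, seq: list, metric: float = None, **kwargs) -> None:
+        """A matched pair of sequences from two modalities. Example use case:
+        1. Matching between a music clip sequence and a video clip sequence, where every element is a MatchedClipIds.
+        Args:
+            seq (list): list of matched pairs across the two modalities
+            metric (float): matching distance
+        """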
+        self.seq = seq
+        self.metric = metric
+        self.__dict__.update(**kwargs)
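
Review note: slice handling in a sequence wrapper such as `ClipSeq` can rely on the built-in `slice.indices`, which resolves omitted or negative bounds against the sequence length. The following is a minimal sketch under that assumption; `MiniSeq` is a hypothetical stand-in that holds plain floats and is not part of mmcm.

```python
from typing import List, Union


class MiniSeq:
    """Hypothetical stand-in for ClipSeq; stores plain values instead of Clip objects."""

    def __init__(self, items: List[float]) -> None:
        self.items = list(items)

    def __len__(self) -> int:
        return len(self.items)

    def __getitem__(self, i: Union[int, slice]) -> Union[float, "MiniSeq"]:
        if isinstance(i, slice):
            # slice.indices() fills in missing start/stop/step and clamps them to the length.
            start, stop, step = i.indices(len(self.items))
            return MiniSeq([self.items[x] for x in range(start, stop, step)])
        return self.items[int(i)]


seq = MiniSeq([0.5, 1.0, 1.5, 2.0])
head = seq[:2]          # omitted start
tail = seq[-2:]         # negative start, omitted stop
every_other = seq[::2]  # omitted start and stop, explicit step
```

With this approach `seq[:2]`, `seq[-2:]`, and `seq[::2]` all work without special-casing `None` bounds.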

MuseV/MMCM/mmcm/data/clip/__init__.py ADDED

	@@ -0,0 +1,5 @@

+from .clip import Clip, ClipSeq
+from .clipid import ClipIds, MatchedClipIds, ClipIdsSeq, MatchedClipIdsSeq
+from .clip_process import find_idx_by_time, find_idx_by_clip, get_subseq_by_time, get_subseq_by_idx, clip_is_top, clip_is_middle, clip_is_end, abadon_old_return_new, reset_clipseq_id, insert_endclip, insert_startclip, drop_start_end_by_time, complete_clipseq, complete_gap
+from .clip_stat import stat_clipseq_duration
+from .clip_filter import ClipFilter, ClipSeqFilter
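
Review note: `find_idx_by_time`, exported here, looks up the clip that contains a timestamp by bisecting the clip start times. The lookup can be sketched as follows, assuming the start times are sorted in ascending order; `index_at_time` and the sample values are hypothetical and operate on plain floats rather than `Clip` objects.

```python
import bisect
from typing import List


def index_at_time(start_times: List[float], timepoint: float) -> int:
    """Return the index of the last clip whose start time is <= timepoint.

    The result is clamped to the valid index range, so times before the first
    clip map to 0 and times after the last start map to the last clip.
    """
    idx = bisect.bisect_right(start_times, timepoint)
    return min(max(0, idx - 1), len(start_times) - 1)


starts = [0.0, 2.0, 5.0]   # hypothetical clip start times in seconds
inside_second = index_at_time(starts, 3.1)
before_first = index_at_time(starts, -1.0)
after_last = index_at_time(starts, 9.0)
```

Because `bisect_right` is used, a timestamp that equals a clip's start time resolves to that clip rather than the previous one.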

MuseV/MMCM/mmcm/data/clip/clip.py ADDED

	@@ -0,0 +1,197 @@

+from __future__ import annotations
+from copy import deepcopy
+from typing import Iterable, List, Tuple, Dict, Hashable, Any, Union
+import numpy as np
+from ...utils.util import convert_class_attr_to_dict
+from ..general.items import Items, Item
+from .clipid import MatchedClipIds
+import logging
+logger = logging.getLogger(__name__)  # pylint: disable=invalid-name
+__all__ = ["Clip", "ClipSeq"]
+class Clip(Item):
+    """A media clip: the segment between two transition points."""
+    def __init__(
+        self,
+        time_start: float,
+        duration: float,
+        clipid: int = None,
+        media_type: str = None,
+        mediaid: str = None,
+        timepoint_type: str = None,
+        text: str = None,
+        stage: str = None,
+        path: str = None,
+        duration_num: int = None,
+        similar_clipseq: MatchedClipIds = None,
+        dynamic: float = None,
+        **kwargs,
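+    ):
+        """
+        Args:
+            time_start (float): Start time in seconds, relative to the media file; corresponds one-to-one with the index in media_map.json.
+            duration (float): Duration of the clip.
+            clipid (int or [int]): Clip index provided by media_map; corresponds one-to-one with the index in media_map.json.
+            media_type (str, optional): One of music, video, text. Defaults to None.
+            mediaid (int): Media id. When clipid is a list, the clip is a fused clip.
+            timepoint_type (int, optional): Transition type of the start point. Defaults to None.
+            text (str, optional): Text description of the clip: lyrics for music, dialogue for video, or even bullet comments. Defaults to None.
+            stage (str, optional): Structural position of the clip within the media file, e.g. intro, chorus, or verse for music; opening, ending, beginning, climax, or transition for video. Defaults to None.
+            path (str, optional): Path of the media file, used for later reading and processing. Defaults to None.
+            duration_num (int, optional): Duration of the clip in frames. Defaults to None.
+            similar_clipseq ([Clip], optional): Clips similar to this one; the exact structure is yet to be defined. Defaults to None.
+        """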
+        self.media_type = media_type
+        self.mediaid = mediaid
+        self.time_start = time_start
+        self.duration = duration
+        self.clipid = clipid
+        self.path = path
+        self.timepoint_type = timepoint_type
+        self.text = text
+        self.stage = stage
+        self.duration_num = duration_num
+        self.similar_clipseq = similar_clipseq
+        self.dynamic = dynamic
+        self.__dict__.update(**kwargs)
+    def preprocess(self):
+        pass
+    def spread_parameters(self):
+        pass
+    @property
+    def time_end(
+        self,
+    ) -> float:
+        return self.time_start + self.duration
+    def get_emb(self, key: str, idx: int) -> float:
+        return self.emb.get_value(key, idx)
+class ClipSeq(Items):
+    """Sequence of media clips."""
+    def __init__(self, items: List[Clip] = None):
+        super().__init__(items)
+        self.clipseq = self.data
+    def preprocess(self):
+        pass
+    def set_clip_value(self, k: Hashable, v: Any) -> None:
+        """Set attribute k to value v on every clip in the sequence."""
+        for i in range(len(self.clipseq)):
+            self.clipseq[i].__setattr__(k, v)
+    def __len__(
+        self,
+    ) -> int:
+        return len(self.clipseq)
+    @property
+    def duration(
+        self,
+    ) -> float:
+        """Sum of Clip.duration over the sequence.
+        Returns:
+            float: total duration of the sequence
+        """
+        if len(self.clipseq) == 0:
+            return 0
+        else:
+            return sum([c.duration for c in self.clipseq])
+    def __getitem__(self, i: Union[int, Iterable]) -> Union[Clip, ClipSeq]:
+        """Support indexing and slicing: an integer returns a Clip, while a slice or an iterable of indices returns a ClipSeq.
+        Args:
+            i (int, slice, or Iterable[int]): index
+        Raises:
+            ValueError: if the index type is not supported
+        Returns:
+            Clip or ClipSeq:
+        """
+        if "int" in str(type(i)):
+            i = int(i)
+        if isinstance(i, int):
+            clip = self.clipseq[i]
+            return clip
+        elif isinstance(i, Iterable):
+            clipseq = [self.__getitem__(x) for x in i]
+            clipseq = ClipSeq(clipseq)
+            return clipseq
+        elif isinstance(i, slice):
+            # slice.indices resolves omitted or negative start/stop/step against the sequence length
+            start, stop, step = i.indices(len(self.clipseq))
+            clipseq = [self.__getitem__(x) for x in range(start, stop, step)]
+            clipseq = ClipSeq(clipseq)
+            return clipseq
+        else:
+            raise ValueError(
+                "unsupported input, should be int or slice, but given {}, type={}".format(
+                    i, type(i)
+                )
+            )
+    @property
+    def mvp_clip(self):
+        """Load the actual clip data in moviepy format.
+        Raises:
+            NotImplementedError: always, since loading is not implemented yet
+        """
+        raise NotImplementedError
+    @property
+    def duration_seq_emb(
+        self,
+    ) -> np.array:
+        emb = np.array([c.duration for c in self.clipseq])
+        return emb
+    @property
+    def timestamp_seq_emb(self) -> np.array:
+        emb = np.array([c.time_start for c in self.clipseq])
+        return emb
+    @property
+    def rela_timestamp_seq_emb(self) -> np.array:
+        duration_seq = [c.duration for c in self.clipseq]
+        emb = np.cumsum(duration_seq) / self.duration
+        return emb
+    def get_emb(self, key: str, idx: int) -> float:
+        clip_start_idx = self.clipseq[0].clipid
+        clip_end_idx = self.clipseq[-1].clipid
+        # TODO: change to a more general form
+        if idx is None:
+            idx = range(clip_start_idx, clip_end_idx + 1)
+        elif isinstance(idx, int):
+            idx += clip_start_idx
+        elif isinstance(idx, Iterable):
+            idx = [x + clip_start_idx for x in idx]
+        else:
+            raise ValueError(
+                f"idx only support None, int, Iterable, but given {idx},type is {type(idx)}"
+            )
+        return self.emb.get_value(key, idx=idx)
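
Review note: `rela_timestamp_seq_emb` in this file divides the cumulative sum of clip durations by the total duration, so each value is a clip's end position as a fraction of the whole sequence. The sketch below illustrates that arithmetic with the standard library's `itertools.accumulate` in place of `np.cumsum`; `relative_end_positions` and the zero-total guard are assumptions for illustration and are not in the original code.

```python
from itertools import accumulate
from typing import List


def relative_end_positions(durations: List[float]) -> List[float]:
    """Return each clip's end time as a fraction of the total duration."""
    total = sum(durations)
    if total == 0:
        # Assumption for this sketch: avoid division by zero on empty or zero-length input.
        return [0.0 for _ in durations]
    # accumulate() yields running sums, the stdlib counterpart of np.cumsum.
    return [end / total for end in accumulate(durations)]


positions = relative_end_positions([2.0, 3.0, 5.0])
```

For durations of 2, 3, and 5 seconds the running sums are 2, 5, and 10, so the last position is always 1.0 for a non-empty sequence with positive total duration.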

MuseV/MMCM/mmcm/data/clip/clip_filter.py ADDED

	@@ -0,0 +1,46 @@

+from typing import Callable, List, Union
+from .clip import ClipSeq
+from .clip_process import reset_clipseq_id
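+class ClipFilter(object):
+    """Clip filter: decides whether a Clip meets the given criteria.
+    Calling an instance returns a bool indicating whether the clip satisfies the predicate functions.
+    """
+    def __init__(self, funcs: Union[Callable, List[Callable]], logic_func: Callable=all) -> None:
+        """Combine several clip predicate functions into one result with logical AND or OR.
+        Args:
+            funcs (list of func): list of predicate functions
+            logic_func (func, optional): all or any. Defaults to all.
+        """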
+        self.funcs = funcs if isinstance(funcs, list) else [funcs]
+        self.logic_func = logic_func
+    def __call__(self, clip) -> bool:
+        flag = [func(clip) for func in self.funcs]
+        flag = self.logic_func(flag)
+        return flag
+# TODO
+class ClipSeqFilter(object):
+    def __init__(self, filter: Callable) -> None:
+        self.filter = filter
+    def __call__(self, clipseq: ClipSeq) -> ClipSeq:
+        new_clipseq = []
+        n_clipseq = len(clipseq)
+        for i in range(n_clipseq):
+            clip = clipseq[i]
+            if self.filter(clip):
+                new_clipseq.append(clip)
+        new_clipseq = reset_clipseq_id(new_clipseq)
+        # logger.debug("ClipSeqFilter: clipseq length before={}, after={}".format(n_clipseq, len(new_clipseq)))
+        return new_clipseq
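
Review note: `ClipFilter` combines several predicates with `all` (logical AND) or `any` (logical OR). A minimal sketch of that pattern follows; `PredicateFilter`, the two predicates, and the dict-based clips are hypothetical stand-ins so the example runs without mmcm.

```python
from typing import Callable, Dict, List, Union

ClipDict = Dict[str, object]  # hypothetical clip representation for this sketch


class PredicateFilter:
    """Stand-in mirroring ClipFilter: combine predicates with all() or any()."""

    def __init__(
        self,
        funcs: Union[Callable, List[Callable]],
        logic_func: Callable = all,
    ) -> None:
        self.funcs = funcs if isinstance(funcs, list) else [funcs]
        self.logic_func = logic_func

    def __call__(self, clip: ClipDict) -> bool:
        return self.logic_func([func(clip) for func in self.funcs])


def long_enough(clip: ClipDict) -> bool:
    return clip["duration"] >= 1.0


def is_chorus(clip: ClipDict) -> bool:
    return clip["stage"] == "chorus"


clips = [
    {"duration": 0.4, "stage": "chorus"},
    {"duration": 2.0, "stage": "intro"},
    {"duration": 3.0, "stage": "chorus"},
]

strict = PredicateFilter([long_enough, is_chorus], logic_func=all)
loose = PredicateFilter([long_enough, is_chorus], logic_func=any)
kept_strict = [c for c in clips if strict(c)]
kept_loose = [c for c in clips if loose(c)]
```

With `all`, only the clip that is both long enough and a chorus survives; with `any`, every sample clip passes because each satisfies at least one predicate.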

MuseV/MMCM/mmcm/data/clip/clip_fusion.py ADDED

	@@ -0,0 +1,64 @@

+from typing import List, Union, Callable
+from copy import deepcopy
+from .clip import ClipSeq
+from .clip_process import reset_clipseq_id
+import logging
+logger = logging.getLogger(__name__)  # pylint: disable=invalid-name
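+# TODO: different clip types need different fusion strategies
+def fuse_clips(s1: ClipSeq, s2: ClipSeq) -> ClipSeq:
+    """Merge two clips.
+    Args:
+        s1 (Clip): the clip to merge into
+        s2 (Clip): the clip, or list of clips, merged into s1
+    Returns:
+        Clip: the merged Clip
+    """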
+    if not isinstance(s2, list):
+        s2 = [s2]
+    s1 = deepcopy(s1)
+    for other_clip in s2:
+        s1.duration += other_clip.duration
+        if s1.stage is not None and other_clip.stage is not None:
+            # TODO: how to retain information about the fused clips
+            s1.stage = "{}_{}".format(s1.stage, other_clip.stage)
+            s1.origin_clipid.extend(other_clip.origin_clipid)
+        if s1.timepoint_type is not None and other_clip.timepoint_type is not None:
+            s1.timepoint_type = "{}_{}".format(
+                s1.timepoint_type, other_clip.timepoint_type
+            )
+    return s1
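+# TODO: different filter and fusion functions do not fit one shared pipeline; to be optimized
+class ClipSeqFusion(object):
+    """Apply a filter to every clip in a ClipSeq and renumber the clips that pass.
+    Args:
+        filter (Callable): predicate that decides whether a clip is kept
+        fuse_func (Callable, optional): fusion function; stored but not yet used by __call__
+    """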
+    def __init__(self, filter: Callable, fuse_func: Callable = None) -> None:
+        self.filter = filter
+        self.fuse_func = fuse_func
+    def __call__(self, clipseq: ClipSeq) -> ClipSeq:
+        new_clipseq = []
+        n_clipseq = len(clipseq)
+        for i in range(n_clipseq):
+            clip = clipseq[i]
+            if self.filter(clip):
+                new_clipseq.append(clip)
+        new_clipseq = reset_clipseq_id(new_clipseq)
+        logger.debug(
+            "ClipSeqFusion: clipseq length before={}, after={}".format(
+                n_clipseq, len(new_clipseq)
+            )
+        )
+        return new_clipseq
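
Review note: the core of `fuse_clips` is to copy the first clip, add up the durations, and join the stage labels with an underscore. A minimal sketch of that behaviour follows; `SimpleClip` and `fuse` are hypothetical, cover only duration and stage, and leave out the `origin_clipid` and `timepoint_type` handling.

```python
from copy import deepcopy
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class SimpleClip:
    """Hypothetical minimal clip with only the fields this sketch touches."""

    time_start: float
    duration: float
    stage: Optional[str] = None


def fuse(first: SimpleClip, others: List[SimpleClip]) -> SimpleClip:
    """Return a copy of `first` extended by the durations and stages of `others`."""
    fused = deepcopy(first)  # keep the input clip unchanged
    for other in others:
        fused.duration += other.duration
        if fused.stage is not None and other.stage is not None:
            fused.stage = f"{fused.stage}_{other.stage}"
    return fused


intro = SimpleClip(time_start=0.0, duration=1.5, stage="intro")
verse = SimpleClip(time_start=1.5, duration=2.5, stage="verse")
merged = fuse(intro, [verse])
```

The deep copy means the original `intro` clip keeps its own duration and stage, which matters when the same clip is reused in several candidate fusions.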

MuseV/MMCM/mmcm/data/clip/clip_process.py ADDED

	@@ -0,0 +1,366 @@

+from functools import partial
+from copy import deepcopy
+from typing import Iterable, List, Tuple, Union
+import bisect
+import logging
+import numpy as np
+from .clip import Clip, ClipSeq
+from .clipid import ClipIds, ClipIdsSeq, MatchedClipIds, MatchedClipIdsSeq
+logger = logging.getLogger(__name__)  # pylint: disable=invalid-name
+__all__ = [
+    "find_idx_by_rela_time",
+    "find_idx_by_time",
+    "find_idx_by_clip",
+    "get_subseq_by_time",
+    "get_subseq_by_idx",
+    "clip_is_top",
+    "clip_is_middle",
+    "clip_is_end",
+    "abadon_old_return_new",
+    "reset_clipseq_id",
+    "insert_endclip",
+    "insert_startclip",
+    "drop_start_end_by_time",
+    "complete_clipseq",
+    "complete_gap",
+    "get_subseq_by_stages",
+    "find_time_by_stage",
+]
+def find_idx_by_rela_time(clipseq: ClipSeq, timepoint: float) -> int:
+    clipseq_duration = clipseq.duration
+    timepoint = clipseq_duration * timepoint
+    clipseq_times = [c.duration for c in clipseq]
+    clipseq_times.insert(0, 0)
+    clipseq_times = np.cumsum(clipseq_times)
+    idx = bisect.bisect_right(clipseq_times, timepoint)
+    idx = min(max(0, idx - 1), len(clipseq) - 1)
+    return idx
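+def find_idx_by_time(clipseq: ClipSeq, timepoint: float) -> int:
+    """Find the position of the clip in clipseq that contains the given time.
+    Args:
+        clipseq (ClipSeq): the clip sequence to search
+        timepoint (float): the time position to look up
+    Returns:
+        int: index of the clip containing timepoint, clamped to the valid range
+    """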
+    clipseq_times = [c.time_start for c in clipseq]
+    idx = bisect.bisect_right(clipseq_times, timepoint)
+    idx = min(max(0, idx - 1), len(clipseq) - 1)
+    return idx
+def find_idx_by_clip(clipseq: ClipSeq, clip: Clip, eps: float = 1e-4) -> int:
+    """Find the closest clip by computing the intersection ratio between the target clip and every candidate clip in clipseq.
+    Args:
+        clipseq (ClipSeq): candidate clip sequence
+        clip (Clip): target clip
+        eps (float, optional): minimum intersection ratio. Defaults to 1e-4.
+    Returns:
+        int: position of the target clip in the candidate sequence, or None if there is no match
+    """
+    timepoints = np.array([[c.time_start, c.time_start + c.duration] for c in clipseq])
+    clip_time_start = clip.time_start
+    clip_duration = clip.duration
+    clip_time_end = clip_time_start + clip_duration
+    max_time_start = np.maximum(timepoints[:, 0], clip_time_start)
+    min_time_end = np.minimum(timepoints[:, 1], clip_time_end)
+    intersection = min_time_end - max_time_start
+    intersection_ratio = intersection / clip_duration
+    max_intersection_ratio = np.max(intersection_ratio)
+    idx = np.argmax(intersection_ratio) if max_intersection_ratio > eps else None
+    return idx
+def get_subseq_by_time(
+    clipseq: ClipSeq,
+    start: float = 0,
+    duration: float = None,
+    end: float = 1,
+    eps: float = 1e-2,
+) -> ClipSeq:
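+    """Trim the head and tail of a media sequence by time and keep the middle part.
+        If start and end are fractions between 0 and 1, they are treated as relative positions and multiplied by duration;
+        if they are greater than 1, they are treated as absolute times.
+    Args:
+        clipseq (ClipSeq): sequence to process
+        start (float, optional): start of the kept part. Defaults to 0.
+        duration (float, optional): current total duration of the media file. Defaults to clipseq.duration.
+        end (float, optional): end of the kept part. Defaults to 1.
+        eps (float, optional): tolerance when comparing end with duration. Defaults to 1e-2.
+    Returns:
+        ClipSeq: processed sequence
+    """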
+    if (start == 0 or start is None) and (end is None or end == 1):
+        logger.warning("you should set start or end")
+        return clipseq
+    if duration is None:
+        duration = clipseq.duration
+    if start is None or start == 0:
+        clip_start_idx = 0
+    else:
+        if start < 1:
+            start = start * duration
+        clip_start_idx = find_idx_by_time(clipseq, start)
+    if end is None or end == 1 or np.abs(duration - end) < eps:
+        # None lets get_subseq_by_idx slice to the end; -1 would drop the last clip
+        clip_end_idx = None
+    else:
+        if end < 1:
+            end = end * duration
+        clip_end_idx = find_idx_by_time(clipseq, end)
+    if clip_end_idx is not None and clip_start_idx >= clip_end_idx:
+        logger.error(
+            f"clip_end_idx({clip_end_idx}) should be > clip_start_idx({clip_start_idx})"
+        )
+    subseq = get_subseq_by_idx(clipseq, clip_start_idx, clip_end_idx)
+    return subseq
+def get_subseq_by_idx(clipseq: ClipSeq, start: int = None, end: int = None) -> ClipSeq:
+    """Slice a subsequence by index range.
+    Args:
+        clipseq (ClipSeq): sequence to slice
+        start (int, optional): start index. Defaults to None.
+        end (int, optional): end index (exclusive). Defaults to None.
+    Returns:
+        ClipSeq: the sliced subsequence
+    """
+    if start is None and end is None:
+        return clipseq
+    if start is None:
+        start = 0
+    if end is None:
+        end = len(clipseq)
+    return clipseq[start:end]
+def clip_is_top(clip: Clip, total: float, th: float = 0.1) -> bool:
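+    """Check whether a Clip belongs to the beginning part.
+    Args:
+        clip (Clip): clip to check
+        total (float): total duration of the ClipSeq the clip belongs to
+        th (float, optional): relative position where the beginning part ends. Defaults to 0.1.
+    Returns:
+        bool: whether the clip is a head clip
+    """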
+    clip_time = clip.time_start
+    if clip_time / total <= th:
+        return True
+    else:
+        return False
+def clip_is_end(clip: Clip, total: float, th: float = 0.9) -> bool:
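+    """Check whether a Clip belongs to the ending part.
+    Args:
+        clip (Clip): clip to check
+        total (float): total duration of the ClipSeq the clip belongs to
+        th (float, optional): relative position where the ending part starts. Defaults to 0.9.
+    Returns:
+        bool: whether the clip is a tail clip
+    """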
+    clip_time = clip.time_start + clip.duration
+    if clip_time / total >= th:
+        return True
+    else:
+        return False
+def clip_is_middle(
+    clip: Clip, total: float, start: float = 0.05, end: float = 0.9
+) -> bool:
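+    """Check whether a Clip belongs to the middle part.
+    Args:
+        clip (Clip): clip to check
+        total (float): total duration of the ClipSeq the clip belongs to
+        start (float, optional): position where the middle part starts. Defaults to 0.05.
+        end (float, optional): position where the middle part ends. Defaults to 0.9.
+    Returns:
+        bool: whether the clip is a middle clip
+    """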
+    if start >= 0 and start < 1:
+        start = total * start
+    if end > 0 and end <= 1:
+        end = total * end
+    clip_time_start = clip.time_start
+    clip_time_end = clip.time_start + clip.duration
+    if (clip_time_start >= start) and (clip_time_end <= end):
+        return True
+    else:
+        return False
+def abadon_old_return_new(s1: Clip, s2: Clip) -> Clip:
+    """A special fusion strategy: discard the earlier clip and return the later one.
+    Args:
+        s1 (Clip): the earlier clip
+        s2 (Clip): the later clip
+    Returns:
+        Clip: the fused Clip
+    """
+    return s2
+# TODO: confirm whether clipid should be updated; updating it makes debugging against the JSON harder
+def reset_clipseq_id(clipseq: ClipSeq) -> ClipSeq:
+    for i in range(len(clipseq)):
+        if isinstance(clipseq[i], dict):
+            clipseq[i]["clipid"] = i
+        else:
+            clipseq[i].clipid = i
+    return clipseq
+def insert_startclip(clipseq: ClipSeq) -> ClipSeq:
+    """Insert a starting clip into a ClipSeq when the first clip does not start at 0.
+    The inserted clip is created with clipseq.ClipClass.
+    Args:
+        clipseq (ClipSeq): sequence to extend
+    Returns:
+        ClipSeq: the ClipSeq with a head clip inserted
+    """
+    if clipseq[0].time_start > 0:
+        start = clipseq.ClipClass(
+            time_start=0, duration=round(clipseq[0].time_start, 3), timepoint_type=0
+        )
+        clipseq.insert(0, start)
+    clipseq = reset_clipseq_id(clipseq)
+    return clipseq
+def insert_endclip(clipseq: ClipSeq, duration: float) -> ClipSeq:
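+    """Insert a tail clip into a ClipSeq when more than 1 second remains after the last clip.
+    The inserted clip is created with clipseq.ClipClass.
+    Args:
+        clipseq (ClipSeq): sequence to extend
+        duration (float): total duration of the sequence
+    Returns:
+        ClipSeq: the ClipSeq with a tail clip appended
+    """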
+    clipseq_endtime = clipseq[-1].time_start + clipseq[-1].duration
+    if duration - clipseq_endtime > 1:
+        end = clipseq.ClipClass(
+            time_start=round(clipseq_endtime, 3),
+            duration=round(duration - clipseq_endtime, 3),
+            timepoint_type=0,
+        )
+        clipseq.append(end)
+    clipseq = reset_clipseq_id(clipseq)
+    return clipseq
+def drop_start_end_by_time(
+    clipseq: ClipSeq, start: float, end: float, duration: float = None
+):
+    return get_subseq_by_time(clipseq=clipseq, start=start, end=end, duration=duration)
+def complete_clipseq(
+    clipseq: ClipSeq, duration: float = None, gap_th: float = 2
+) -> ClipSeq:
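+    """Most downstream steps need the time information in clipseq to be continuous and complete; this fills in the missing parts.
+    For example, a music_map generated from lyric timestamps may lack a head and a tail and have gaps in the middle.
+    Args:
+        clipseq (ClipSeq): sequence to complete
+        duration (float, optional): duration of the whole sequence. Defaults to None.
+        gap_th (float, optional): gaps shorter than this are merged into the previous clip. Defaults to 2.
+    Returns:
+        ClipSeq: the completed sequence, continuous and complete in time.
+    """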
+    if isinstance(clipseq, list):
+        clipseq = ClipSeq(clipseq)
+        return complete_clipseq(clipseq=clipseq, duration=duration, gap_th=gap_th)
+    clipseq = complete_gap(clipseq, th=gap_th)
+    clipseq = insert_startclip(clipseq)
+    if duration is not None:
+        clipseq = insert_endclip(clipseq, duration)
+    return clipseq
+def complete_gap(clipseq: ClipSeq, th: float = 2) -> ClipSeq:
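+    """Generate blank clips (timepoint_type = 0) for gaps; if a gap is too short, it is merged into the previous lyric clip.
+    Args:
+        clipseq (ClipSeq): the original MusicClipSeq generated from lyrics
+        th (float, optional): gaps shorter than this are merged into the previous clip. Defaults to 2.
+    Returns:
+        ClipSeq: the completed sequence
+    """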
+    gap_clipseq = []
+    clipid = 0
+    for i in range(len(clipseq) - 1):
+        time_start = clipseq[i].time_start
+        duration = clipseq[i].duration
+        time_end = time_start + duration
+        next_time_start = clipseq[i + 1].time_start
+        time_diff = next_time_start - time_end
+        if time_diff >= th:
+            blank_clip = clipseq.ClipClass(
+                time_start=time_end,
+                duration=time_diff,
+                timepoint_type=0,
+                clipid=clipid,
+            )
+            gap_clipseq.append(blank_clip)
+            clipid += 1
+        else:
+            clipseq[i].duration = next_time_start - time_start
+    clipseq.extend(gap_clipseq)
+    clipseq.clips = sorted(clipseq.clips, key=lambda clip: clip.time_start)
+    reset_clipseq_id(clipseq)
+    return clipseq
+def find_time_by_stage(
+    clipseq: ClipSeq, stages: Union[str, List[str]] = None
+) -> Tuple[float, float]:
+    if not isinstance(stages, list):
+        stages = [stages]
+    for clip in clipseq:
+        if clip.stage in stages:
+            return clip.time_start, clip.time_end
+    return None, None
+def get_subseq_by_stages(clipseq: ClipSeq, stages: Union[str, List[str]]) -> ClipSeq:
+    if not isinstance(stages, list):
+        stages = [stages]
+    start, _ = find_time_by_stage(clipseq, stages[0])
+    _, end = find_time_by_stage(clipseq, stages[-1])
+    if start is None:
+        start = 0
+    if end is None:
+        end = clipseq.duration
+    subseq = get_subseq_by_time(clipseq=clipseq, start=start, end=end)
+    return subseq
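
Note on the time lookup in clip_process.py: the following is a minimal sketch of the bisect-based search used by find_idx_by_time. It assumes a hypothetical SimpleClip dataclass and a plain list in place of the mmcm Clip and ClipSeq classes, so it illustrates the technique only and is not the library's implementation.

```python
import bisect
from dataclasses import dataclass
from typing import List


@dataclass
class SimpleClip:
    # Hypothetical stand-in for mmcm's Clip; only the fields used here.
    time_start: float
    duration: float


def find_idx(clips: List[SimpleClip], timepoint: float) -> int:
    """Return the index of the clip whose start is the latest one <= timepoint."""
    starts = [c.time_start for c in clips]  # must be sorted ascending
    idx = bisect.bisect_right(starts, timepoint)
    # Clamp so that timepoints before the first clip or after the last
    # clip still map to a valid index.
    return min(max(0, idx - 1), len(clips) - 1)


clips = [SimpleClip(0.0, 2.0), SimpleClip(2.0, 3.0), SimpleClip(5.0, 1.0)]
```

Because bisect_right places a timepoint equal to a start time after that start, a timepoint exactly on a boundary resolves to the clip that begins there.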

MuseV/MMCM/mmcm/data/clip/clip_stat.py ADDED Viewed

	@@ -0,0 +1,13 @@

+from typing import Tuple
+import numpy as np
+from .clip import ClipSeq
+def stat_clipseq_duration(
+    clipseq: ClipSeq,
+) -> Tuple[np.ndarray, np.ndarray]:
+    clip_duration = [clip.duration for clip in clipseq]
+    (hist, bin_edges) = np.histogram(clip_duration)
+    return hist, bin_edges

MuseV/MMCM/mmcm/data/clip/clipid.py ADDED Viewed

	@@ -0,0 +1,70 @@

+from __future__ import annotations
+from typing import Union, List
+__all__ = [
+    "ClipIds",
+    "ClipIdsSeq",
+    "MatchedClipIds",
+    "MatchedClipIdsSeq",
+]
+class ClipIds(object):
+    def __init__(
+        self,
+        clipids: Union[int, List[int]],
+    ) -> None:
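+        """Indices of Clips within a ClipSeq, mainly used for a Clip fused from several Clips. Example use case:
+        1. One MusicClip can match several VideoClips; the VideoClip indices can then be expressed as ClipIds.
+        Args:
+            clipids (list or int): indices within the ClipSeq
+        """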
+        self.clipids = clipids if isinstance(clipids, list) else [clipids]
+class ClipIdsSeq(object):
+    def __init__(self, clipids_seq: List[ClipIds]) -> None:
+        """A sequence of ClipIds. Example use case:
+        1. Regrouping a MediaClipSeq by splitting and recombining it into a coarser-grained ClipSeq.
+        Args:
+            clipids_seq (list): list of combined ClipIds
+        """
+        self.clipids_seq = (
+            clipids_seq if isinstance(clipids_seq, list) else [clipids_seq]
+        )
+# TODO: metric may become a dict later
+class MatchedClipIds(object):
+    def __init__(
+        self, id1: ClipIds, id2: ClipIds, metric: float = None, **kwargs
+    ) -> None:
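+        """A matched pair of clips from two modalities. Example use case:
+        1. The match between a music clip and a video clip.
+        Args:
+            id1 (ClipIds): clip(s) of the first modality
+            id2 (ClipIds): clip(s) of the second modality
+            metric (float): matching distance
+        """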
+        self.id1 = id1 if isinstance(id1, ClipIds) else ClipIds(id1)
+        self.id2 = id2 if isinstance(id2, ClipIds) else ClipIds(id2)
+        self.metric = metric
+        self.__dict__.update(**kwargs)
+class MatchedClipIdsSeq(object):
+    def __init__(self, seq: List[MatchedClipIds], metric: float = None, **kwargs) -> None:
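+        """A matched pair of sequences from two modalities. Example use case:
+        1. The match between a music clip sequence and a video clip sequence; every element is a MatchedClipIds.
+        Args:
+            seq (list): list of matched pairs between the two modalities
+            metric (float): matching distance
+        """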
+        self.seq = seq
+        self.metric = metric
+        self.__dict__.update(**kwargs)
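
Note on clipid.py: the classes above accept either a single value or a list and normalize it to a list. A minimal sketch of that pattern follows; as_id_list and the sample ids are hypothetical names used for illustration only.

```python
from typing import Dict, List, Union


def as_id_list(clipids: Union[int, List[int]]) -> List[int]:
    """Normalize a single clip index or a list of indices to a list."""
    return clipids if isinstance(clipids, list) else [clipids]


# Hypothetical match: one music clip (index 3) paired with two video clips.
match: Dict[str, List[int]] = {
    "music": as_id_list(3),
    "video": as_id_list([7, 8]),
}
```

Checking against list (rather than against the element type) keeps an already-wrapped list from being nested a second time.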

MuseV/MMCM/mmcm/data/crawl/__init__.py ADDED Viewed

File without changes

MuseV/MMCM/mmcm/data/crawl/download.py ADDED Viewed

	@@ -0,0 +1,72 @@

+from collections import namedtuple
+from typing import NamedTuple, Tuple, List
+import logging
+import os
+import numpy as np
+import subprocess
+import requests
+import wget
+from .youtube import download_youtube
+from .flicker import download_flickr
+from .ffmpeg import ffmpeg_load
+logger = logging.getLogger(__name__)
+# DownloadStatus  = namedtuple("DownloadStatus", ["status_code", "msg"])
+status_code = {
+    0: "download: succ",
+    -1: "download: failed",
+    -2: "clip: failed",
+    -3: "directory not exists",
+    -4: "skip task",
+    -404: "param error",
+}
+def download_with_request(url, path):
+    res = requests.get(url)
+    if res.status_code == 200:
+        with open(path, "wb") as f:
+            f.write(res.content)
+    else:
+        logger.error("request failed with status code %s", res.status_code)
+    return path
+def download_video(url, save_path:str=None, save_dir:str=None, basename:str=None, filename:str=None, format:str=None, data_type: str="wget", **kwargs) -> Tuple[int, str]:
+    if save_path is None:
+        if basename is None:
+            basename =  f"{filename}.{format}"
+        save_path = os.path.join(save_dir, basename)
+    if save_dir is None:
+        save_dir = os.path.dirname(save_path)
+    if basename is None:
+        basename = os.path.basename(save_path)
+    if filename is None:
+        filename, format = os.path.splitext(basename)
+        format = format.lstrip(".")  # splitext keeps the leading dot
+    os.makedirs(save_dir, exist_ok=True)
+    if os.path.exists(save_path):
+        return (-4, save_path)
+    try:
+        if data_type == "requests":
+            save_path = download_with_request(url=url, path=save_path)
+        elif data_type == "wget":
+            save_path = wget.download(url=url, out=save_path)
+        elif data_type == "youtube":
+            save_path = download_youtube(url, format=format, save_dir=save_dir, filename=basename)
+        elif data_type == "flickr":
+            save_path = download_flickr(url, save_path)
+        elif data_type == "ffmpeg":
+            code = ffmpeg_load(url=url, save_path=save_path)
+        else:
+            raise ValueError(f"data_type should be one of [requests, wget, youtube, flickr, ffmpeg], but given {data_type}")
+    except Exception as e:
+        logger.error("failed to download file {} to {}".format(url, save_path))
+        logger.exception(e)
+        return (-1, None)
+    return (0, save_path)
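
Note on download_video: the argument handling at the top of the function derives the missing path pieces from whichever ones are given. A minimal sketch of that resolution, assuming a hypothetical helper resolve_save_path with an ext parameter (not part of the diff), could look like this:

```python
import os
from typing import Optional, Tuple


def resolve_save_path(
    save_path: Optional[str] = None,
    save_dir: Optional[str] = None,
    filename: Optional[str] = None,
    ext: Optional[str] = None,
) -> Tuple[str, str, str]:
    """Return (save_path, filename_stem, extension_without_dot)."""
    if save_path is None:
        if save_dir is None or filename is None or ext is None:
            raise ValueError("give save_path, or save_dir + filename + ext")
        save_path = os.path.join(save_dir, f"{filename}.{ext}")
    # os.path.splitext keeps the leading dot in the extension, so strip it.
    stem, dot_ext = os.path.splitext(os.path.basename(save_path))
    return save_path, stem, dot_ext.lstrip(".")
```

Validating the inputs up front avoids passing None to os.path.join, which would otherwise fail with a less helpful TypeError.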

MuseV/MMCM/mmcm/data/crawl/error.py ADDED Viewed

	@@ -0,0 +1,20 @@

+class SubprocessError(Exception):
+    """
+    Exception object that contains information about an error that occurred
+    when running a command line command with a subprocess.
+    """
+    def __init__(self, cmd, return_code, stdout, stderr, *args):
+        msg = 'Got non-zero exit code ({1}) from command "{0}": {2}'
+        if stderr.strip():
+            err_msg = stderr
+        else:
+            err_msg = stdout
+        msg = msg.format(cmd[0], return_code, err_msg)
+        self.cmd = cmd
+        self.cmd_return_code = return_code
+        self.cmd_stdout = stdout
+        self.cmd_stderr = stderr
+        super(SubprocessError, self).__init__(msg, *args)

MuseV/MMCM/mmcm/data/crawl/ffmpeg.py ADDED Viewed

	@@ -0,0 +1,39 @@

+import subprocess
+from .error import SubprocessError
+class FfmpegInvalidURLError(Exception):
+    """
+    Exception raised when a 4XX or 5XX error is returned when making a request
+    """
+    def __init__(self, url, error, *args):
+        self.url = url
+        self.error = error
+        msg = 'Got error when making request to "{}": {}'.format(url, error)
+        super(FfmpegInvalidURLError, self).__init__(msg, *args)
+def ffmpeg_load(url: str, save_path: str) -> int:
+    def run(cmd):
+        proc = subprocess.Popen(
+            cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
+        stdout, stderr = proc.communicate()
+        return_code = proc.returncode
+        if return_code != 0:
+            raise SubprocessError(
+                cmd, return_code, stdout.decode(), stderr.decode())
+        return return_code
+    command = ['ffmpeg', '-n', '-i', url, '-t', '10', '-f', 'mp4',
+               '-r', '30', '-vcodec', 'h264', save_path, '-loglevel', 'error']
+    code = run(command)
+    return code
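
Note on the run helper in ffmpeg.py: it raises on any non-zero exit code, so callers see failures as exceptions rather than return values. The sketch below shows the same pattern with subprocess.run; CommandFailed is a hypothetical stand-in for SubprocessError, and the Python interpreter is used as the child process so the example does not depend on ffmpeg being installed.

```python
import subprocess
import sys
from typing import List


class CommandFailed(Exception):
    """Hypothetical stand-in for SubprocessError in error.py."""

    def __init__(self, cmd: List[str], return_code: int, stderr: str) -> None:
        super().__init__(f"exit code {return_code} from {cmd[0]!r}: {stderr}")
        self.return_code = return_code


def run(cmd: List[str]) -> int:
    """Run cmd, returning 0 on success and raising CommandFailed otherwise."""
    proc = subprocess.run(cmd, capture_output=True, text=True)
    if proc.returncode != 0:
        raise CommandFailed(cmd, proc.returncode, proc.stderr)
    return proc.returncode


ok_code = run([sys.executable, "-c", "pass"])
```

With this design a caller that wants to retry, as download_flickr does, has to catch the exception; checking the returned code alone never observes a failure.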

MuseV/MMCM/mmcm/data/crawl/flicker.py ADDED Viewed

	@@ -0,0 +1,22 @@

+import os
+from .error import SubprocessError
+from .ffmpeg import ffmpeg_load
+def extract_flickr_id(url):
+    return url.strip('/').split('/')[-4]
+def download_flickr(url: str, save_path: str) -> str:
+    try:
+        # ffmpeg_load raises SubprocessError on a non-zero exit code
+        ffmpeg_load(url=url, save_path=save_path)
+        return save_path
+    except SubprocessError:
+        pass
+    # only retry when failed!
+    flickr_id = extract_flickr_id(url)
+    url = 'https://www.flickr.com/video_download.gne?id={}'.format(
+        flickr_id)
+    ffmpeg_load(url=url, save_path=save_path)
+    return save_path

MuseV/MMCM/mmcm/data/crawl/youtube.py ADDED Viewed

	@@ -0,0 +1,13 @@

+import os
+from pytube import YouTube
+def download_youtube(url, format, save_dir, filename):
+    youtube = YouTube(url)
+    streams = youtube.streams.filter(progressive=True,
+                                     file_extension=format)
+    save_path = streams.get_highest_resolution().download(output_path=save_dir,
+                                              filename=filename)
+    return save_path

MuseV/MMCM/mmcm/data/emb/__init__.py ADDED Viewed

	@@ -0,0 +1,2 @@


+from .emb import *
+from .h5py_emb import H5pyMediaMapEmb, H5pyMediaMapEmbProxy

MuseV/MMCM/mmcm/data/emb/emb.py ADDED Viewed

	@@ -0,0 +1,104 @@

+"""Stores the embeddings of a mediamap separately from the mediamap itself; still under development.
+"""
+import logging
+import numpy as np
+logger = logging.getLogger(__name__)  # pylint: disable=invalid-name
+__all__ = ["MediaMapEmb"]
+class MediaMapEmb(object):
+    def __init__(self, path: str) -> None:
+        """
+        OfflineEmb = {
+            "overall_algo": Emb,  # embedding of the whole file
+            # multi-aspect embeddings of the whole file
+            "theme": np.array,  # theme
+            "emotion_algo":  np.array,  # emotion
+            "semantic_algo":  np.array,  # semantics
+            "clips_overall_algo":  np.array, n_clip x clip_emb
+            "clips_emotion_algo":  np.array, n_clip x clip_emb
+            "clips_semantic_algo":  np.array, n_clip x clip_emb
+            "clips_theme_algo":  np.array, n_clip x clip_emb
+            "scenes_overall_algo":  np.array, n_scenes x scene_emb
+            "scenes_emotion_algo":  np.array, n_scenes x scene_emb
+            "scenes_semantic_algo":  np.array, n_scenes x scene_emb
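+            "scenes_theme_algo":  np.array, n_scenes x scene_emb
+            # Segments can come from shot-transition splitting, MusicStage, etc.; clips are currently shot-transition segments.
+            # If paragraph-level segmentation is needed later, add a stage field at the same level as clips.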
+            "frames_overall_algo":  np.array, n_frames x frame_emb
+            "frames_emotion_algo":  np.array, n_frames x frame_emb
+            "frames_semantic_algo":  np.array, n_frames x frame_emb
+            "frames_theme_algo":  np.array, n_frames x frame_emb
+            "frames_objs": {
+                "frame_id": {  #
+                    "overall_algo":  np.array, n_objs x obj_emb
+                    "emotion_algo":  np.array, n_objs x obj_emb
+                    "semantic_algo":  np.array, n_objs x obj_emb
+                    "theme_algo":  np.array, n_objs x obj_emb
+                }
+            }
+            "roles_algo": {
+                "roleid": np.array, n x obj_emb
+            }
+        }
+        Args:
+            path (str): hdf5 storage path
+        """
+        self.path = path
+    def get_value(self, key, idx=None):
+        raise NotImplementedError
+    def __getitem__(self, key):
+        return self.get_value(key)
+    def get_media(self, factor, algo):
+        return self.get_value(f"{factor}_{algo}")
+    def get_clips(self, factor, algo, idx=None):
+        return self.get_value(f"clips_{factor}_{algo}", idx=idx)
+    def get_frames(self, factor, algo, idx=None):
+        return self.get_value(f"frames_{factor}_{algo}", idx=idx)
+    def get_frame_objs(self, frame_idx, factor, algo, idx=None):
+        return self.get_value(["frames_objs", frame_idx, f"{factor}_{algo}"], idx=idx)
+    def set_value(self, key, value, idx=None):
+        raise NotImplementedError
+    def set_media(self, factor, value, algo):
+        self.set_value([f"{factor}_{algo}"], value)
+    def set_clips(self, factor, value, algo, idx=None):
+        self.set_value([f"clips_{factor}_{algo}"], value, idx=idx)
+    def set_frames(self, factor, value, algo, idx=None):
+        self.set_value([f"frames_{factor}_{algo}"], value, idx=idx)
+    def set_frame_objs(self, frame_idx, factor, value, algo, idx=None):
+        return self.set_value(
+            ["frames_objs", frame_idx, f"{factor}_{algo}"], value, idx=idx
+        )
+    def set_roles(self, algo, value, idx=None):
+        return self.set_value(f"roles_{algo}", value, idx=idx)
+    def get_roles(self, algo, idx=None):
+        return self.get_value(f"roles_{algo}", idx=idx)
+    def __setitem__(self, key, value):
+        self.set_value(key, value)
+class MediaMapEmbProxy(MediaMapEmb):
+    pass
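
Note on the key convention in emb.py: the getters and setters build flat keys of the form `{level}_{factor}_{algo}` and delegate storage to get_value / set_value. A minimal sketch with a hypothetical dict-backed store (the backend added in this commit uses h5py instead) shows how the keys are composed; the factor and algo values are made up for illustration.

```python
from typing import Any, Dict


class DictEmbStore:
    """Hypothetical in-memory backend illustrating the key naming only."""

    def __init__(self) -> None:
        self._data: Dict[str, Any] = {}

    def set_value(self, key: str, value: Any) -> None:
        self._data[key] = value

    def get_value(self, key: str) -> Any:
        return self._data[key]

    def set_clips(self, factor: str, value: Any, algo: str) -> None:
        # Clip-level embeddings are stored under "clips_<factor>_<algo>".
        self.set_value(f"clips_{factor}_{algo}", value)

    def get_clips(self, factor: str, algo: str) -> Any:
        return self.get_value(f"clips_{factor}_{algo}")


store = DictEmbStore()
store.set_clips("emotion", [[0.1, 0.2], [0.3, 0.4]], algo="demo")
```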

MuseV/MMCM/mmcm/data/emb/h5py_emb.py ADDED Viewed

	@@ -0,0 +1,119 @@

+from typing import Union, List
+import logging
+import h5py
+import numpy as np
+from .emb import MediaMapEmb
+logger = logging.getLogger(__name__)  # pylint: disable=invalid-name
+__all__ = ["H5pyMediaMapEmb", "save_value_with_h5py"]
+def save_value_with_h5py(
+    path: str,
+    value: Union[np.ndarray, None],
+    key: str,
+    idx: Union[int, List[int]] = None,
+    dtype=None,
+    shape=None,
+    overwrite: bool = False,
+):
+    with h5py.File(path, "a") as f:
+        if dtype is None:
+            dtype = value.dtype
+        if shape is None:
+            shape = value.shape
+        del_key = False
+        if key in f:
+            if overwrite:
+                del_key = True
+            if f[key].dtype != h5py.special_dtype(vlen=str):
+                if f[key].shape != shape:
+                    del_key = True
+            if del_key:
+                del f[key]
+        if key not in f:
+            f.create_dataset(key, shape=shape, dtype=dtype)
+        if idx is None:
+            f[key][...] = value
+        else:
+            f[key][idx] = value
+class H5pyMediaMapEmb(MediaMapEmb):
+    def __init__(self, path: str) -> None:
+        """
+        OfflineEmb = {
+            "overall_algo": Emb,  # embedding of the whole file
+            # multi-aspect embeddings of the whole file
+            "theme": np.array,  # theme
+            "emotion_algo":  np.array,  # emotion
+            "semantic_algo":  np.array,  # semantics
+            "clips_overall_algo":  np.array, n_clip x clip_emb
+            "clips_emotion_algo":  np.array, n_clip x clip_emb
+            "clips_semantic_algo":  np.array, n_clip x clip_emb
+            "clips_theme_algo":  np.array, n_clip x clip_emb
+            "scenes_overall_algo":  np.array, n_scenes x scene_emb
+            "scenes_emotion_algo":  np.array, n_scenes x scene_emb
+            "scenes_semantic_algo":  np.array, n_scenes x scene_emb
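+            "scenes_theme_algo":  np.array, n_scenes x scene_emb
+            # Segments can come from shot-transition splitting, MusicStage, etc.; clips are currently shot-transition segments.
+            # If paragraph-level segmentation is needed later, add a stage field at the same level as clips.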
+            "frames_overall_algo":  np.array, n_frames x frame_emb
+            "frames_emotion_algo":  np.array, n_frames x frame_emb
+            "frames_semantic_algo":  np.array, n_frames x frame_emb
+            "frames_theme_algo":  np.array, n_frames x frame_emb
+            "frames_objs_algo": {
+                "frame_id_algo": {  #
+                    "overall_algo":  np.array, n_objs x obj_emb
+                    "emotion_algo":  np.array, n_objs x obj_emb
+                    "semantic_algo":  np.array, n_objs x obj_emb
+                    "theme_algo":  np.array, n_objs x obj_emb
+                }
+            }
+            "roles_algo": {
+                "roleid": np.array, n x obj_emb
+            }
+        }
+        Args:
+            path (str): hdf5 storage path
+        """
+        super().__init__(path)
+        # To be improved: support reading and writing through a `with open`-style context
+        self.f = h5py.File(path, "a")
+    def _keys_index(self, key):
+        if not isinstance(key, list):
+            key = [key]
+        key = "/".join([str(x) for x in key if x is not None])
+        return key
+    def get_value(self, key, idx=None):
+        new_key = self._keys_index(key)
+        if idx is None:
+            data = np.array(self.f[new_key])
+        else:
+            data = np.array(self.f[new_key][idx])
+        return data
+    def set_value(self, key, value, idx=None):
+        new_key = self._keys_index(key)
+        if new_key not in self.f:
+            self.f.create_dataset(new_key, shape=value.shape, dtype=value.dtype)
+        if idx is None:
+            self.f[new_key][...] = value
+        else:
+            self.f[new_key][idx] = value
+    def close(self):
+        self.f.close()
+class H5pyMediaMapEmbProxy(H5pyMediaMapEmb):
+    pass
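
Note on _keys_index in h5py_emb.py: nested keys are joined with "/" so that a list such as ["frames_objs", frame_idx, name] addresses a dataset inside nested HDF5 groups. A standalone sketch of that joining step, using a hypothetical free function keys_index, is:

```python
from typing import Any, List, Union


def keys_index(key: Union[Any, List[Any]]) -> str:
    """Join a key or a list of key parts into an HDF5-style path."""
    if not isinstance(key, list):
        key = [key]
    # None parts are skipped; everything else is converted to str.
    return "/".join(str(part) for part in key if part is not None)
```

Since integer frame indices are converted with str, a frame index of 12 and the string "12" resolve to the same path.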

MuseV/MMCM/mmcm/data/emb/json_emb.py ADDED Viewed

File without changes

MuseV/MMCM/mmcm/data/emb/numpy_emb.py ADDED Viewed

File without changes

MuseV/MMCM/mmcm/data/extract_feature/__init__.py ADDED Viewed

File without changes

MuseV/MMCM/mmcm/data/extract_feature/base_extract_feature.py ADDED Viewed

	@@ -0,0 +1,28 @@

+from typing import List, Union, Any
+import torch
+from torch import nn
+import numpy as np
+import h5py
+class BaseFeatureExtractor(nn.Module):
+    def __init__(self, device: str = "cpu", dtype=torch.float32, name: str = None):
+        super().__init__()
+        self.device = device
+        self.dtype = dtype
+        self.name = name
+    def extract(
+        self, data: Any, return_type: str = "numpy"
+    ) -> Union[np.ndarray, torch.Tensor]:
+        raise NotImplementedError
+    def __call__(self, *args: Any, **kwds: Any) -> Any:
+        return self.extract(*args, **kwds)
+    def save_with_h5py(self, f: Union[h5py.File, str], *args, **kwds):
+        raise NotImplementedError
+    def forward(self, *args: Any, **kwds: Any) -> Any:
+        return self.extract(*args, **kwds)

MuseV/MMCM/mmcm/data/general/__init__.py ADDED Viewed

	@@ -0,0 +1 @@


+from .items import Items

MuseV/MMCM/mmcm/data/general/items.py ADDED Viewed

	@@ -0,0 +1,69 @@

+from collections import UserList
+from collections.abc import Iterable
+from typing import Iterator, Any, List
+from ...utils.util import convert_class_attr_to_dict
+__all__ = ["Item", "Items"]
+class Item(object):
+    def __init__(self) -> None:
+        pass
+    def to_dct(self, target_keys: List[str] = None, ignored_keys: List[str] = None):
+        base_ignored_keys = [
+            "kwargs",
+        ]
+        if isinstance(ignored_keys, list):
+            base_ignored_keys.extend(ignored_keys)
+        elif isinstance(ignored_keys, str):
+            base_ignored_keys.append(ignored_keys)
+        else:
+            pass
+        return convert_class_attr_to_dict(
+            self, target_keys=target_keys, ignored_keys=base_ignored_keys
+        )
+    def preprocess(self):
+        pass
+class Items(UserList):
+    def __init__(
+        self,
+        data: Any = None,
+    ):
+        if data is None:
+            data = list()
+        if not isinstance(data, list):
+            data = [data]
+        super().__init__(data)
+    def __len__(self):
+        return len(self.data)
+    def __getitem__(self, i):
+        return self.data[i]
+    def __delitem__(self, i):
+        del self.data[i]
+    def __setitem__(self, i, v):
+        self.data[i] = v
+    def insert(self, i, v):
+        self.data.insert(i, v)
+    def __str__(self):
+        return str(self.data)
+    def to_dct(self, target_keys: List[str] = None, ignored_keys: List[str] = None):
+        items = [item.to_dct(target_keys, ignored_keys) for item in self.data]
+        return items
+    def __iter__(self) -> Iterator:
+        return iter(self.data)
+    def preprocess(self):
+        pass
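
Note on Items: the constructor accepts None, a single object, or a list, and always ends up holding a list in self.data. A minimal sketch of that normalization on top of collections.UserList, using a hypothetical class name Bag, is:

```python
from collections import UserList
from typing import Any


class Bag(UserList):
    """Hypothetical container mirroring the constructor logic of Items."""

    def __init__(self, data: Any = None) -> None:
        if data is None:
            data = []
        if not isinstance(data, list):
            data = [data]  # wrap a single item
        super().__init__(data)


empty = Bag()
single = Bag("a")
many = Bag(["a", "b"])
```

UserList already provides __len__, __iter__, insert and item access on top of self.data, so a subclass only needs to override them when it wants different behavior.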

MuseV/MMCM/mmcm/data/media_map/__init__.py ADDED Viewed

	@@ -0,0 +1 @@


+from .media_map import MetaInfo, MediaMap, MetaInfoList

MuseV/MMCM/mmcm/data/media_map/media_map.py ADDED Viewed

	@@ -0,0 +1,393 @@

+from __future__ import annotations
+import bisect
+import logging
+from copy import deepcopy
+from functools import partial
+from typing import Any, Callable, Iterable, List, Union, Tuple, Dict
+import numpy as np
+from ..clip.clip_process import get_subseq_by_time
+from ..clip.clip_stat import stat_clipseq_duration
+from ..clip import Clip, ClipSeq, ClipIds, MatchedClipIds, MatchedClipIdsSeq
+from .media_map_process import get_sub_mediamap_by_time
+from ..emb import MediaMapEmb, H5pyMediaMapEmb
+from ..general.items import Item, Items
+from ...utils.data_util import pick_subdct
+from ...utils.util import convert_class_attr_to_dict, load_dct_from_file
+logger = logging.getLogger(__name__)  # pylint: disable=invalid-name
+__all__ = ["MetaInfo", "MetaInfoList", "MediaMap", "MediaMapSeq"]
+class MetaInfo(Item):
+    """File-level metadata for media such as songs and videos"""
+    def __init__(
+        self,
+        mediaid=None,
+        media_name=None,
+        media_duration=None,
+        signature=None,
+        media_path: str = None,
+        media_map_path: str = None,
+        start: float = None,
+        end: float = None,
+        ext=None,
+        **kwargs,
+    ):
+        super().__init__()
+        self.mediaid = mediaid
+        self.media_name = media_name
+        self.media_duration = media_duration
+        self.signature = signature
+        self.media_path = media_path
+        self.media_map_path = media_map_path
+        self.start = start
+        self.end = end
+        self.ext = ext
+        self.__dict__.update(**kwargs)
+        self.preprocess()
+    def preprocess(self):
+        self.set_start_end()
+    def set_start_end(self):
+        if self.start is None:
+            self.start = 0
+        elif self.start >= 0 and self.start <= 1:
+            self.start = self.start * self.media_duration
+        if self.end is None:
+            self.end = self.media_duration
+        elif self.end >= 0 and self.end <= 1:
+            self.end = self.end * self.media_duration
+class MetaInfoList(Items):
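+    """List of media metadata, mainly used to keep the metadata of each original media file when editing multiple songs or videos"""
+    def __init__(self, items: Union[MetaInfo, List[MetaInfo]] = None):
+        """
+        Args:
+            items (MetaInfo or list, optional): a MetaInfo or a list of MetaInfo. Defaults to None.
+        """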
+        if items is None:
+            items = []
+        else:
+            items = items if isinstance(items, list) else [items]
+        super().__init__(items)
+        self.meta_info_list = self.data
+        if len(self.data) > 1:
+            self.reset()
+    def __len__(self):
+        return len(self.meta_info_list)
+    def __getitem__(self, i) -> MetaInfo:
+        return self.meta_info_list[i]
+    @property
+    def groupnum(self) -> int:
+        return len(self.meta_info_list)
+class MediaMap(object):
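+    """Base class for media information; it can also be seen as the base class for music maps, visual maps, and rhythm-game maps. Its main attributes are MetaInfo, MetaInfoList, and ClipSeq.
+    Different media types use different attribute classes, so these are defined as class variables. Define your own attribute classes if they differ.
+    """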
+    def __init__(
+        self,
+        meta_info: MetaInfo = None,
+        clipseq: ClipSeq = None,
+        stageseq: ClipSeq = None,
+        frameseq: ClipSeq = None,
+        emb: H5pyMediaMapEmb = None,
+        **kwargs,
+    ):
+        """Stores information about a media file; media_info is a JSON object or a plain dict
+        Args:
+            meta_info (MetaInfo): when sub_meta_info is not None, meta_info is assembled from sub_meta_info
+            sub_meta_info (None or [MetaInfo]): keeps the information of each sub MediaInfo when several MediaInfo are concatenated
+            clipseq (ClipSeq): sorted by clipidx
+            stageseq (ClipSeq): a coarser segmentation than clipseq, e.g. clips are shots and stages are scenes, or clips are key-point segments and stages are structural segments
+            frameseq (ClipSeq): a finer segmentation than clipseq
+            kwargs (dict, optional): any extra information is added to meta_info as a supplement
+        """
+        self.meta_info = meta_info
+        self.clipseq = clipseq
+        self.frameseq = frameseq
+        self.stageseq = stageseq
+        self.emb = emb
+        self.meta_info.__dict__.update(**kwargs)
+        self.preprocess()
+    def preprocess(
+        self,
+    ):
+        if (self.meta_info.start != 0 and self.meta_info.start is not None) or (
+            self.meta_info.end is not None and self.meta_info.end == 1
+        ):
+            self.drop_head_and_tail()
+        self.meta_info.preprocess()
+        if self.clipseq is not None:
+            self.clipseq.preprocess()
+        if self.frameseq is not None:
+            self.frameseq.preprocess()
+        if self.stageseq is not None:
+            self.stageseq.preprocess()
+        self.clip_start_idx = self.clipseq[0].clipid
+        self.clip_end_idx = self.clipseq[-1].clipid
+    def drop_head_and_tail(self) -> MediaMap:
+        self.clipseq = get_subseq_by_time(
+            self.clipseq,
+            start=self.meta_info.start,
+            end=self.meta_info.end,
+            duration=self.meta_info.media_duration,
+        )
+        if self.stageseq is not None:
+            self.stageseq = get_subseq_by_time(
+                self.stageseq,
+                start=self.meta_info.start,
+                end=self.meta_info.end,
+                duration=self.meta_info.media_duration,
+            )
+    def set_clip_value(self, k, v):
+        """Set a field on every clip in clipseq.
+        Args:
+            k (str): name of the Clip field
+            v (any): value to assign to that field
+        """
+        self.clipseq.set_clip_value(k, v)
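+    def spread_metainfo_2_clip(
+        self, target_keys: List = None, ignored_keys: List = None
+    ) -> None:
+        """Copy fields from meta_info onto each clip so that later clip-level processing can use them.
+        Args:
+            target_keys ([str]): fields to copy
+            ignored_keys ([str]): fields to skip
+        """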
+        dst = pick_subdct(
+            self.meta_info.__dict__, target_keys=target_keys, ignored_keys=ignored_keys
+        )
+        for k, v in dst.items():
+            self.set_clip_value(k, v)
+    def spread_parameters(self, target_keys: list, ignored_keys) -> None:
+        """Broadcast metadata: copy the media_info metadata onto each clip, then call each clip's own spread_parameters."""
+        self.spread_metainfo_2_clip(target_keys=target_keys, ignored_keys=ignored_keys)
+        for clip in self.clipseq:
+            clip.spread_parameters()
+    def stat(
+        self,
+    ):
+        """Print summary statistics of media_info for inspection. Currently this covers:
+        1. clip durations
+        """
+        self.stat_clipseq_duration()
+    def stat_clipseq_duration(
+        self,
+    ):
+        hist, bin_edges = stat_clipseq_duration(self.clipseq)
+        print(self.media_name, "bin_edges", bin_edges)
+        print(self.media_name, "hist", hist)
+    def to_dct(self, target_keys: list = None, ignored_keys: list = None):
+        raise NotImplementedError
+    @property
+    def duration(
+        self,
+    ):
+        return self.clipseq.duration
+    @property
+    def mediaid(
+        self,
+    ):
+        return self.meta_info.mediaid
+    @property
+    def media_name(
+        self,
+    ):
+        return self.meta_info.media_name
+    @property
+    def duration_seq_emb(self):
+        return self.clipseq.duration_seq_emb
+    @property
+    def timestamp_seq_emb(self):
+        return self.clipseq.timestamp_seq_emb
+    @property
+    def rela_timestamp_seq_emb(self):
+        return self.clipseq.rela_timestamp_seq_emb
+    def get_emb(self, key, idx=None):
+        # TODO: generalize this lookup
+        if idx is None:
+            idx = range(self.clip_start_idx, self.clip_end_idx + 1)
+        elif isinstance(idx, int):
+            idx += self.clip_start_idx
+        elif isinstance(idx, Iterable):
+            idx = [x + self.clip_start_idx for x in idx]
+        else:
+            raise ValueError(
+                f"idx only support None, int, Iterable, but given {idx},type is {type(idx)}"
+            )
+        return self.emb.get_value(key, idx=idx)
+    def get_meta_info_attr(self, key: str) -> Any:
+        return getattr(self.meta_info, key)
+    @classmethod
+    def from_json_path(
+        cls, path: str, emb_path: str, media_path: str = None, **kwargs
+    ) -> MediaMap:
+        media_map = load_dct_from_file(path)
+        emb = H5pyMediaMapEmb(emb_path)
+        return cls.from_data(media_map, emb=emb, media_path=media_path, **kwargs)
+class MediaMapSeq(Items):
+    def __init__(self, maps: List[MediaMap]) -> None:
+        super().__init__(maps)
+        self.maps = self.data
+        self.preprocess()
+        self.each_map_clipseq_num = [len(m.clipseq) for m in self.maps]
+        self.each_map_clipseq_num_cumsum = np.cumsum([0] + self.each_map_clipseq_num)
+    @property
+    def clipseq(self):
+        clipseq = []
+        for m in self.maps:
+            clipseq.extend(m.clipseq.data)
+        return type(self.maps[0].clipseq)(clipseq)
+    @property
+    def stagesseq(self):
+        stagesseq = []
+        for m in self.maps:
+            stagesseq.extend(m.stageseq.data)
+        return type(self.maps[0].stageseq)(stagesseq)
+    @property
+    def frameseq(self):
+        frameseq = []
+        for m in self.maps:
+            frameseq.extend(m.frameseq.data)
+        return type(self.maps[0].frameseq)(frameseq)
+    def preprocess(self):
+        for m in self.maps:
+            m.preprocess()
+    def _combine_str(
+        self,
+        attrs: List[str],
+        sep: str = "|",
+        single_maxlen: int = 10,
+        total_max_length: int = 60,
+    ) -> str:
+        return sep.join([str(attr)[:single_maxlen] for attr in attrs])[
+            :total_max_length
+        ]
+    def get_meta_info_attr(self, key: str, func: Callable) -> Any:
+        attrs = [m.get_meta_info_attr(key) for m in self.maps]
+        return func(attrs)
+    @property
+    def mediaid(self) -> str:
+        return self.get_meta_info_attr(key="mediaid", func=self._combine_str)
+    @property
+    def media_name(self) -> str:
+        return self.get_meta_info_attr(key="media_name", func=self._combine_str)
+    @property
+    def duration(self) -> float:
+        return sum([m.duration for m in self.maps])
+    @property
+    def media_duration(self) -> float:
+        return self.get_meta_info_attr(key="media_duration", func=sum)
+    @classmethod
+    def from_json_paths(
+        cls,
+        media_map_class: MediaMap,
+        media_paths: List[str],
+        media_map_paths: List[str],
+        emb_paths: List[str],
+        **kwargs,
+    ) -> MediaMapSeq:
+        map_seq = [
+            media_map_class.from_json_path(
+                path=media_map_paths[i],
+                emb_path=emb_paths[i],
+                media_path=media_paths[i],
+                **kwargs,
+            )
+            for i in range(len(media_map_paths))
+        ]
+        return cls(map_seq)
+    # TODO: implement mapseq stat func
+    def stat(self):
+        for m in self.maps:
+            m.stat()
+    def _combine_embs(self, embs):
+        return np.concatenate(embs, axis=0)
+    @property
+    def duration_seq_emb(self):
+        embs = [m.duration_seq_emb for m in self.maps]
+        return self._combine_embs(embs)
+    @property
+    def timestamp_seq_emb(self):
+        embs = [m.timestamp_seq_emb for m in self.maps]
+        return self._combine_embs(embs)
+    @property
+    def rela_timestamp_seq_emb(self):
+        embs = [m.rela_timestamp_seq_emb for m in self.maps]
+        return self._combine_embs(embs)
+    def clip_idx_2_map_idx(self, idx):
+        target_map_idx = bisect.bisect_right(self.each_map_clipseq_num_cumsum, idx)
+        target_map_idx = min(max(0, target_map_idx - 1), len(self.maps) - 1)
+        target_map_clip_idx = idx - self.each_map_clipseq_num_cumsum[target_map_idx]
+        return target_map_idx, target_map_clip_idx
+    def get_emb(self, key: str, idx: Union[None, int, List[int]] = None) -> np.array:
+        if idx is None:
+            embs = [m.get_emb(key, idx=idx) for m in self.maps]
+        else:
+            if not isinstance(idx, list):
+                idx = [idx]
+            embs = []
+            for c_idx in idx:
+                target_map_idx, target_map_clip_idx = self.clip_idx_2_map_idx(c_idx)
+                embs.append(
+                    self.maps[target_map_idx].get_emb(key, int(target_map_clip_idx))
+                )
+        if len(embs) == 1:
+            return embs[0]
+        else:
+            return self._combine_embs(embs)
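
The lookup in `clip_idx_2_map_idx` combines a cumulative clip count with a binary search. A minimal standalone sketch of that idea follows; the function name `clip_idx_to_map_idx` and the plain-list input are illustrative assumptions, not part of MMCM.

```python
import bisect
from itertools import accumulate


def clip_idx_to_map_idx(clips_per_map, idx):
    """Map a global clip index to (map index, clip index inside that map).

    Hypothetical helper: clips_per_map stands in for each_map_clipseq_num.
    """
    # Cumulative offsets with a leading 0, e.g. [3, 2, 4] -> [0, 3, 5, 9].
    offsets = [0] + list(accumulate(clips_per_map))
    # bisect_right gives the position of the first offset strictly greater
    # than idx; the map that owns idx is the one just before it.
    map_idx = bisect.bisect_right(offsets, idx) - 1
    # Clamp so that out-of-range indices fall into the first or last map.
    map_idx = min(max(0, map_idx), len(clips_per_map) - 1)
    return map_idx, idx - offsets[map_idx]


if __name__ == "__main__":
    for global_idx in (0, 3, 4, 8):
        print(global_idx, clip_idx_to_map_idx([3, 2, 4], global_idx))
```

Because of the clamp, an index past the last clip still resolves to the last map, with a local index that is out of range there; a caller may prefer to validate the index first.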

MuseV/MMCM/mmcm/data/media_map/media_map_process.py ADDED Viewed

	@@ -0,0 +1,72 @@

+from __future__ import annotations
+from typing import List, Union, TYPE_CHECKING
+from ..clip.clip_process import (
+    get_subseq_by_time,
+    find_time_by_stage,
+)
+if TYPE_CHECKING:
+    from ..media_map.media_map import MediaMap
+    from ..clip import Clip, ClipSeq
+__all__ =[
+    "get_sub_mediamap_by_clip_idx",
+    "get_sub_mediamap_by_stage",
+    "get_sub_mediamap_by_time",
+]
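+def get_sub_mediamap_by_time(media_map: MediaMap, start: float = 0, end: float = 1, eps=1e-2) -> MediaMap:
+    """Take the sub-sequence of clips inside a time window and update the related fields of media_map in place.
+    Args:
+        media_map (MediaMap): the media map to trim; it is modified and returned
+        start (float): start time; a value below 1 is treated as a fraction of media_map.duration
+        end (float): end time; a value of at most 1 is treated as a fraction of media_map.duration, and None means meta_info.media_duration
+    Returns:
+        MediaMap: the same media_map object, trimmed to [start, end]
+    """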
+    if start < 1:
+        start = media_map.duration * start
+    if end is None:
+        end = media_map.meta_info.media_duration
+    elif end <= 1:
+        end = media_map.duration * end
+    media_map.meta_info.start = start
+    media_map.meta_info.end = end
+    media_map.clipseq = get_subseq_by_time(
+        media_map.clipseq,
+        start=start,
+        end=end,
+    )
+    if media_map.stageseq is not None:
+        media_map.stageseq = get_subseq_by_time(media_map.stageseq, start=start, end=end)
+    return media_map
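+def get_sub_mediamap_by_clip_idx(media_map: MediaMap, start: int=None, end: int=None) -> MediaMap:
+    """Take the sub-sequence of clips between two clip indices and also update the related fields of media_map.
+    Args:
+        media_map (MediaMap): the media map to trim
+        start (int, optional): index of the first clip to keep. Defaults to the first clip.
+        end (int, optional): index of the last clip to keep. Defaults to the last clip.
+    """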
+    if start is None:
+        start = 0
+    if end is None:
+        end = -1
+    start = media_map.clipseq[start].time_start
+    end = media_map.clipseq[end].time_end
+    media_map = get_sub_mediamap_by_time(media_map=media_map, start=start, end=end)
+    return media_map
+def get_sub_mediamap_by_stage(media_map: MediaMap, stages: Union[str, List[str]]) -> MediaMap:
+    if isinstance(stages, str):
+        stages = [stages]
+    start, _ = find_time_by_stage(media_map.stageseq, stages[0])
+    _, end = find_time_by_stage(media_map.stageseq, stages[-1])
+    media_map = get_sub_mediamap_by_time(media_map=media_map, start=start, end=end)
+    return media_map
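
`get_sub_mediamap_by_time` accepts bounds either as fractions of the duration or as absolute times. The sketch below isolates that rule so it can be tried on plain numbers; `resolve_time_window` is a hypothetical name, and the two duration arguments stand in for `media_map.duration` and `media_map.meta_info.media_duration`.

```python
def resolve_time_window(start, end, duration, media_duration):
    """Turn fractional or absolute bounds into absolute times.

    Illustrative only: mirrors the branching in get_sub_mediamap_by_time.
    """
    # A start below 1 is read as a fraction of the clip-sequence duration.
    if start < 1:
        start = duration * start
    # A missing end means "up to the end of the media".
    if end is None:
        end = media_duration
    # An end of at most 1 is read as a fraction of the duration.
    elif end <= 1:
        end = duration * end
    return start, end


if __name__ == "__main__":
    print(resolve_time_window(0.25, 0.5, 120.0, 125.0))  # fractional bounds
    print(resolve_time_window(10, None, 120.0, 125.0))   # absolute start, open end
```

One consequence of this rule is that an absolute bound of exactly 1 is ambiguous: a start of 1 is kept as an absolute time, while an end of 1 is read as "the full duration".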

MuseV/MMCM/mmcm/music/__init__.py ADDED Viewed

	@@ -0,0 +1,6 @@

+from .music_map.music_map import MusicMap, MusicMapSeq
+from .music_map.music_clip import MusicClip, MusicClipSeq
+from .music_map.meta_info import MusicMetaInfo
+from .music_map.load_music_map import load_music_map
+from .utils.path_util import get_audio_path_dct

MuseV/MMCM/mmcm/music/music_map/__init__.py ADDED Viewed

File without changes

MuseV/MMCM/mmcm/music/music_map/beat_map.py ADDED Viewed

	@@ -0,0 +1,82 @@

+import numpy as np
+from librosa.core.audio import get_duration
+from ...data.clip.clip_process import insert_endclip, insert_startclip
+from .clip_process import filter_clipseq_target_point
+from .music_clip import MusicClip, MusicClipSeq
+def beatnet2TMEType(beat: np.array, duration: float) -> MusicClipSeq:
+    """Convert BeatNet beats to the TME beat type.
+    Args:
+        beat (np.array): Nx2,
+            1st column is time,
+            2nd is type,
+                0, end point
+                1, strong beat
+                2,3,4 weak beat
+                -1 lyric
+        duration (float): audio time length
+    Returns:
+        MusicClipSeq: one clip per interval between consecutive beats
+    """
+    n = len(beat)
+    beat = np.insert(beat, 0, 0, axis=0)
+    beat = np.insert(beat, n + 1, [duration, 0], axis=0)
+    clips = []
+    for i in range(n + 1):
+        beat_type = int(beat[i + 1, 1])
+        clip = MusicClip(
+            time_start=beat[i, 0],  # start time
+            duration=round(beat[i + 1, 0] - beat[i, 0], 3),  # clip duration
+            clipid=i,  # clip index
+            timepoint_type=beat_type,
+        )
+        clips.append(clip)
+    clipseq = MusicClipSeq(clips=clips)
+    return clipseq
+def generate_beatseq_with_beatnet(audio_path: str) -> np.array:
+    """Generate a beat sequence with BeatNet.
+    Args:
+        audio_path (str): path to the audio file
+    Returns:
+        np.array: beat sequence, Nx2,
+            1st column is time,
+            2nd is type,
+                0, end point
+                1, strong beat
+                2,3,4 weak beat
+    """
+    from BeatNet.BeatNet import BeatNet
+    estimator = BeatNet(1, mode="offline", inference_model="DBN", plot=[], thread=False)
+    output = estimator.process(audio_path=audio_path)
+    return output
+def generate_music_map_with_beatnet(
+    audio_path: str, target: list = [0, 1]
+) -> tuple:
+    """Build a beat MusicClipSeq with BeatNet.
+    Args:
+        audio_path (str): path to the audio file
+        target (list, optional): beat types to keep. Defaults to [0, 1].
+    Returns:
+        MusicClipSeq: the filtered beat sequence
+        np.array: the raw beat detection result
+    """
+    output = generate_beatseq_with_beatnet(audio_path)
+    duration = get_duration(filename=audio_path)
+    clipseq = beatnet2TMEType(output, duration)
+    clipseq = insert_startclip(clipseq)
+    clipseq = insert_endclip(clipseq, duration)
+    clipseq = filter_clipseq_target_point(clipseq, target=target)
+    return clipseq, output
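
The padding and differencing done by `beatnet2TMEType` can be tried without BeatNet or the `MusicClip` class. This is a rough sketch that returns plain tuples instead of clips; the name `beats_to_segments` and the tuple layout are assumptions made for illustration.

```python
import numpy as np


def beats_to_segments(beat, duration):
    """Return (time_start, duration, type) for each inter-beat segment."""
    beat = np.asarray(beat, dtype=float).reshape(-1, 2)
    # Pad with a start row (time 0) and an end row (time = duration, type 0).
    padded = np.vstack([[0.0, 0.0], beat, [duration, 0.0]])
    segments = []
    for i in range(len(padded) - 1):
        start = padded[i, 0]
        length = round(padded[i + 1, 0] - start, 3)
        # As in beatnet2TMEType, a segment takes the type of the beat that ends it.
        segments.append((float(start), float(length), int(padded[i + 1, 1])))
    return segments


if __name__ == "__main__":
    demo = [[0.5, 1], [1.0, 2], [1.5, 1]]
    for seg in beats_to_segments(demo, duration=2.0):
        print(seg)
```

With N beats the result has N + 1 segments, the last of which ends at the audio duration and carries type 0 (end point).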

MuseV/MMCM/mmcm/music/music_map/clip_process.py ADDED Viewed

	@@ -0,0 +1,196 @@

+from __future__ import annotations
+from typing import TYPE_CHECKING, Dict, List
+import numpy as np
+from ...data.clip.clip_process import find_idx_by_time, reset_clipseq_id
+from ...data.clip.clip_fusion import fuse_clips
+from ...utils.util import merge_list_continuous_same_element
+if TYPE_CHECKING:
+    from .music_clip import MusicClip, MusicClipSeq
+    from .music_map import MusicMap, MusicMapSeq
+# TODO: merge with the generic clip operations
+def music_clip_is_short(clip: MusicClip, th: float = 3) -> bool:
+    """Check whether a music clip is too short.
+    Args:
+        clip (MusicClip): the clip to check
+        th (float, optional): duration threshold below which a clip counts as short. Defaults to 3.
+    Returns:
+        bool: True if the clip is shorter than th
+    """
+    return clip.duration < th
+def music_clip_timepoint_is_target(clip: MusicClip, target: list = [-1, 1, 0]) -> bool:
+    """Check whether the key-point type of a music clip is one of the target types.
+    Key-point types currently follow VideoMashup/videomashup/data_structure/music_data_structure.py
+    Args:
+        clip (MusicClip): the clip to check
+        target (list, optional): target key-point types. Defaults to [-1, 1, 0].
+    Returns:
+        bool: True if any of the clip's key-point types is in target
+    """
+    timepoint = clip.timepoint_type
+    if isinstance(timepoint, int):
+        timepoint = {timepoint}
+    else:
+        timepoint = {int(x) for x in timepoint.split("_")}
+    if timepoint & set(target):
+        return True
+    else:
+        return False
+def filter_clipseq_target_point(
+    clipseq: MusicClipSeq, target: list = [-1, 1, 0]
+) -> MusicClipSeq:
+    """Drop key points that are not in target and fuse the clips they separated.
+    Args:
+        clipseq (MusicClipSeq): the sequence to process
+        target (list, optional): key-point types to keep. Defaults to [-1, 1, 0].
+    Returns:
+        MusicClipSeq: the processed sequence
+    """
+    n_clipseq = len(clipseq)
+    if n_clipseq == 1:
+        return clipseq
+    newclipseq = []
+    start_clip = clipseq[0]
+    if music_clip_timepoint_is_target(start_clip, target=target):
+        has_start_clip = True
+    else:
+        has_start_clip = False
+    i = 1
+    while i <= n_clipseq - 1:
+        clip = clipseq[i]
+        start_clip_is_target = music_clip_timepoint_is_target(start_clip, target=target)
+        next_clip_is_target = music_clip_timepoint_is_target(clip, target=target)
+        # logger.debug("filter_clipseq_target_point: i={},start={}, clip={}".format(i, start_clip["timepoint_type"], clip["timepoint_type"]))
+        # logger.debug("start_clip_is_target: {}, next_clip_is_target {}".format(start_clip_is_target, next_clip_is_target))
+        if not has_start_clip:
+            start_clip = clip
+            has_start_clip = next_clip_is_target
+        else:
+            if start_clip_is_target:
+                has_start_clip = True
+                if next_clip_is_target:
+                    newclipseq.append(start_clip)
+                    start_clip = clip
+                    if i == n_clipseq - 1:
+                        newclipseq.append(clip)
+                else:
+                    start_clip = fuse_clips(start_clip, clip)
+                    if i == n_clipseq - 1:
+                        newclipseq.append(start_clip)
+                    # logger.debug("filter_clipseq_target_point: fuse {}, {}".format(i, clip["timepoint_type"]))
+            else:
+                start_clip = clip
+        i += 1
+    newclipseq = reset_clipseq_id(newclipseq)
+    return newclipseq
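+def merge_musicclip_into_clipseq(
+    clip: MusicClip, clipseq: MusicClipSeq, th: float = 1
+) -> MusicClipSeq:
+    """Insert a new music clip into clipseq, unless doing so would leave a clip that is too short.
+    Args:
+        clip (MusicClip): the clip to insert
+        clipseq (MusicClipSeq): the sequence to insert into
+        th (float, optional): the insertion is skipped when either affected clip would be no longer than th. Defaults to 1.
+    Returns:
+        MusicClipSeq: the sequence after the attempted insertion
+    """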
+    n_clipseq = len(clipseq)
+    clip_time = clip.time_start
+    idx = find_idx_by_time(clipseq, clip_time)
+    last_clip_time_start = clipseq[idx].time_start
+    next_clip_time_start = clipseq[idx].time_start + clipseq[idx].duration
+    last_clip_time_delta = clip_time - last_clip_time_start
+    clip_duration = next_clip_time_start - clip_time
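+    # TODO: vary th for chorus clips to raise note density; not used yet, pending rhythm-game beatmaps
+    # TODO: move this business logic into a separate function
+    # Only insert key points into chorus clips (or clips without lyrics)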
+    if clipseq[idx].text is None or (
+        clipseq[idx].text is not None
+        and clipseq[idx].stage is not None
+        and "C" in clipseq[idx].stage
+    ):
+        if (last_clip_time_delta > th) and (clip_duration > th):
+            clip.duration = clip_duration
+            clipseq[idx].duration = last_clip_time_delta
+            clipseq.insert(idx + 1, clip)
+        clipseq = reset_clipseq_id(clipseq)
+    return clipseq
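+def merge_music_clipseq(clipseq1: MusicClipSeq, clipseq2: MusicClipSeq) -> MusicClipSeq:
+    """Merge the clips of clipseq2 into clipseq1. Each insertion also checks the length of the resulting clips.
+    Args:
+        clipseq1 (MusicClipSeq): the target sequence that receives the clips
+        clipseq2 (MusicClipSeq): the sequence to merge in; it is emptied in the process
+    Returns:
+        MusicClipSeq: the merged sequence
+    """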
+    while len(clipseq2) > 0:
+        clip = clipseq2[0]
+        clipseq1 = merge_musicclip_into_clipseq(clip, clipseq1)
+        del clipseq2[0]
+    return clipseq1
+def merge_lyricseq_beatseq(
+    lyric_clipseq: MusicClipSeq, beat_clipseq: MusicClipSeq
+) -> MusicClipSeq:
+    """Merge a beat sequence into a lyric sequence.
+    Args:
+        lyric_clipseq (MusicClipSeq): the lyric sequence
+        beat_clipseq (MusicClipSeq): the beat sequence
+    Returns:
+        MusicClipSeq: the merged sequence
+    """
+    newclipseq = merge_music_clipseq(lyric_clipseq, beat_clipseq)
+    # for i, clip in enumerate(newclipseq):
+    # logger.debug("i={}, time_start={}, duration={}".format(i, clip.time_start, clip.duration))
+    return newclipseq
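+def get_stageseq_from_clipseq(clipseq: MusicClipSeq) -> List[Dict]:
+    """Merge neighbouring clips that share the same clip.stage and compute the time span of each merged run.
+    Returns:
+        List[Dict]: segments split according to the music structure
+    """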
+    stages = [clip.stage for clip in clipseq]
+    merge_stages_idx = merge_list_continuous_same_element(stages)
+    merge_stages = []
+    for n, stages_idx in enumerate(merge_stages_idx):
+        dct = {
+            "clipid": n,
+            "time_start": clipseq[stages_idx["start"]].time_start,
+            "time_end": clipseq[stages_idx["end"]].time_end,
+            "stage": stages_idx["element"],
+            "original_clipid": list(
+                range(stages_idx["start"], stages_idx["end"] + 1)
+            ),  # MSS ranges are closed on both ends
+        }
+        dct["duration"] = dct["time_end"] - dct["time_start"]
+        merge_stages.append(dct)
+    return merge_stages
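
`get_stageseq_from_clipseq` relies on `merge_list_continuous_same_element`, whose source is not shown in this diff. Judging only from how its result is consumed (dicts with `start`, `end` and `element`, both ends inclusive), a stand-in could look like the sketch below; `merge_runs` is a hypothetical name and the real utility may differ.

```python
from itertools import groupby


def merge_runs(labels):
    """Group consecutive equal labels into inclusive index ranges.

    Assumed behaviour, inferred from the caller; not the MMCM implementation.
    """
    runs = []
    start = 0
    for element, group in groupby(labels):
        length = len(list(group))
        # Both ends are inclusive, matching the "end + 1" used by the caller.
        runs.append({"start": start, "end": start + length - 1, "element": element})
        start += length
    return runs


if __name__ == "__main__":
    for run in merge_runs(["VA", "VA", "CA", "CA", "CA", "VB"]):
        print(run)
```

Each run then maps to one stage whose time span runs from the first clip's `time_start` to the last clip's `time_end`.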

MuseV/MMCM/mmcm/music/music_map/convert_type.py ADDED Viewed

	@@ -0,0 +1,57 @@

+from ...data.clip.clip_process import (
+    insert_startclip,
+    insert_endclip,
+    reset_clipseq_id,
+)
+from .music_clip import MusicClip, MusicClipSeq
+def read_osu_hitobjs(path: str) -> list:
+    """Read an osu! rhythm-game beatmap.
+    Args:
+        path (str): path to the beatmap file
+    Returns:
+        list: the lines of the HitObjects section, as strings
+    """
+    lines = []
+    is_hit_info_start = False
+    with open(path, "r") as f:
+        for line in f:
+            if is_hit_info_start:
+                lines.append(line.strip())
+            if "[HitObjects]" in line:
+                is_hit_info_start = True
+    return lines
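+def osu2itech(src: list, duration: float = None) -> MusicClipSeq:
+    """Convert an osu! beatmap into our target format.
+    Args:
+        src (list): path to the beatmap file, or the list of HitObjects lines already read from it
+        duration (float, optional): length of the song. Defaults to None.
+    Returns:
+        MusicClipSeq: the music clip sequence
+    """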
+    if isinstance(src, str):
+        src = read_osu_hitobjs(src)
+    timepoints = [float(line.split(",")[2]) for line in src]
+    clips = []
+    for i in range(len(timepoints) - 1):
+        clip = MusicClip(
+            time_start=round(timepoints[i] / 1000, 3),
+            timepoint_type=0,
+            duration=round((timepoints[i + 1] - timepoints[i]) / 1000, 3),
+            clipid=i,
+        )
+        clips.append(clip)
+    if len(clips) > 0:
+        clips = insert_startclip(clips)
+        if duration is not None:
+            clips = insert_endclip(clips, duration=duration)
+        clips = reset_clipseq_id(clips)
+    return MusicClipSeq(clips)
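
The conversion in `osu2itech` reads the third comma-separated field of each HitObjects line as a time in milliseconds and turns consecutive times into clips. A small sketch of that step on in-memory lines follows; `hitobject_intervals` is an illustrative name, the sample lines are made up, and skipping blank lines is an added assumption rather than something the original code does.

```python
def hitobject_intervals(lines):
    """Return (time_start, duration) in seconds between consecutive hit objects."""
    # Field 2 (0-based) of a hit object line is its time in milliseconds.
    times_ms = [float(line.split(",")[2]) for line in lines if line.strip()]
    return [
        (round(t0 / 1000, 3), round((t1 - t0) / 1000, 3))
        for t0, t1 in zip(times_ms, times_ms[1:])
    ]


if __name__ == "__main__":
    sample = ["256,192,1000,1,0", "128,96,1500,1,0", "", "64,48,2250,1,0"]
    print(hitobject_intervals(sample))
```

N hit objects yield N - 1 intervals, which is why `osu2itech` afterwards inserts a start clip and, when the song length is known, an end clip.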

MuseV/MMCM/mmcm/music/music_map/load_music_map.py ADDED Viewed

	@@ -0,0 +1,38 @@

+from typing import List
+from .music_map import MusicMap, MusicMapSeq
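+def load_music_map(
+    music_map_paths,
+    music_paths,
+    emb_paths,
+    start: float=None,
+    end: float=None,
+    target_stages: List[str] = None,
+    **kwargs,
+):
+    """Load music maps from file and convert them to MusicMap objects. When music_map_paths is a list, it describes several songs.
+    Args:
+        music_map_paths (str or [str]): path or list of paths to the music map files
+        music_paths (str or [str]): path or list of paths to the music files; must have the same length as music_map_paths
+        emb_paths (str or [str]): path or list of paths to the embedding files
+    Returns:
+        MusicMap or MusicMapSeq: the loaded music map information
+    """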
+    dct ={
+        "start": start,
+        "end": end,
+        "target_stages": target_stages,
+    }
+    if isinstance(music_map_paths, list):
+        music_map = MusicMapSeq.from_json_paths(media_map_class=MusicMap, media_paths=music_paths, media_map_paths=music_map_paths, emb_paths=emb_paths, **dct, **kwargs)
+        if len(music_map) == 1:
+            music_map = music_map[0]
+    else:
+        music_map = MusicMap.from_json_path(path=music_map_paths, emb_path=emb_paths, media_path=music_paths, **dct, **kwargs)
+    return music_map

MuseV/MMCM/mmcm/music/music_map/lyric_map.py ADDED Viewed

	@@ -0,0 +1,149 @@

+import numpy as np
+from sklearn.preprocessing import normalize, minmax_scale
+from scipy.signal import savgol_filter
+# TODO: update the music map class information
+from ...data.clip.clip_process import (
+    complete_clipseq,
+    find_idx_by_clip,
+    insert_endclip,
+    insert_startclip,
+    reset_clipseq_id,
+)
+from .music_clip import Clip, ClipSeq
+from .music_clip import MusicClipSeq
+from .music_map import MusicMap
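+def generate_lyric_map(
+    path: str, duration: float = None, gap_th: float = 2
+) -> MusicMap:
+    """Build a music map from a lyric file.
+    Args:
+        path (str): path to the lyric file
+        duration (float, optional): total duration of the audio the lyrics belong to. Defaults to None.
+        gap_th (float, optional): threshold that decides whether a gap between lyric lines is merged into the preceding clip. Defaults to 2.
+    Returns:
+        MusicMap: the music map built from the lyric file
+    """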
+    from ..music_map.lyric_process import lyricfile2musicinfo
+    lyric_info = lyricfile2musicinfo(path)
+    lyric_info = MusicMap(lyric_info, duration=duration)
+    clipseq = lyric_info.clipseq
+    lyric_info.meta_info.duration = duration
+    # mark every lyric clip with timepoint_type -1 (lyric) before completing the sequence
+    for i in range(len(clipseq)):
+        clipseq[i].timepoint_type = -1
+    lyric_info.clipseq = complete_clipseq(
+        clipseq=clipseq, duration=duration, gap_th=gap_th
+    )
+    return lyric_info
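+def insert_field_2_clipseq(clipseq: ClipSeq, reference: ClipSeq, field: str) -> ClipSeq:
+    """For each clip in clipseq, copy the given field from the matching clip in reference, when that value is not None.
+    Args:
+        clipseq (ClipSeq): the target clip sequence
+        reference (ClipSeq): the reference clip sequence
+        field (str): the field to copy
+    Returns:
+        ClipSeq: the clip sequence with the field updated
+    """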
+    for i, clip in enumerate(clipseq):
+        idx = find_idx_by_clip(reference, clip=clip)
+        if idx is not None:
+            if getattr(reference[idx], field) is not None:
+                clipseq[i].__dict__[field] = getattr(reference[idx], field)
+    return clipseq
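+def insert_rythm_2_clipseq(clipseq, reference):
+    """Set the rythm field from the structure (MSS) information in stage. The current strategy is very simple: 0.25 for verses (Vx), 0.75 for choruses (Cx), None otherwise.
+    Args:
+        clipseq (ClipSeq): the target clip sequence, whose rythm field is set
+        reference (ClipSeq): the reference clip sequence, whose stage field is read
+    Returns:
+        ClipSeq: the clip sequence with the rythm field updated
+    """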
+    def stage2rythm(stage):
+        if "V" in stage:
+            return 0.25
+        elif "C" in stage:
+            return 0.75
+        else:
+            return None
+    for i, clip in enumerate(clipseq):
+        idx = find_idx_by_clip(reference, clip=clip)
+        if idx is not None:
+            if reference[idx].stage is not None:
+                clipseq[i].rythm = stage2rythm(reference[idx].stage)
+    return clipseq
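+def insert_rythm_from_clip(clipseq: MusicClipSeq, beat: np.array) -> MusicClipSeq:
+    """Add rhythm information to every clip in a MusicClipSeq. It currently combines
+        1. the number of lyric characters per second, min-max normalized to [0, 1]
+        2. the number of key points per second (currently from BeatNet), min-max normalized to [0, 1]
+        3. the sum of features 1 and 2, weighted according to the song structure
+    Args:
+        clipseq (MusicClipSeq): the MusicClipSeq to process
+        beat (np.array): beat detection result, Nx2, used to count key points per second.
+            1st column is time,
+            2nd is type,
+                0, end point
+                1, strong beat
+                2,3,4 weak beat
+    Returns:
+        MusicClipSeq: the MusicClipSeq with the rhythm value stored in each clip's dynamic field
+    """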
+    mss_cofficient = {
+        "intro": 1.0,
+        "bridge": 1.0,
+        "end": 0.8,
+        "VA": 1.0,
+        "VB": 1.0,
+        "CA": 1.6,
+        "CB": 1.6,
+    }
+    # text_num_per_second
+    text_num_per_second_lst = [clip.tnps for clip in clipseq if clip.tnps != 0]
+    common_tnps = np.min(text_num_per_second_lst)
+    tnps = np.array([clip.tnps if clip.tnps != 0 else common_tnps for clip in clipseq])
+    tnps = minmax_scale(tnps)
+    # beat point _num_per_second
+    beat_pnps = np.zeros(len(clipseq))
+    for i, clip in enumerate(clipseq):
+        time_start = clip.time_start
+        time_end = clip.time_end
+        target_beat = beat[(beat[:, 0] >= time_start) & (beat[:, 0] < time_end)]
+        beat_pnps[i] = len(target_beat) / clip.duration
+    beat_pnps = minmax_scale(beat_pnps)
+    # cofficient
+    cofficients = np.array(
+        [
+            mss_cofficient[clip.stage]
+            if clip.stage in mss_cofficient and clip.stage is not None
+            else 1.0
+            for clip in clipseq
+        ]
+    )
+    rythm = cofficients * (tnps + beat_pnps)
+    rythm = minmax_scale(rythm)
+    rythm = savgol_filter(rythm, window_length=5, polyorder=3)
+    rythm = minmax_scale(rythm)
+    for i, clip in enumerate(clipseq):
+        clip.dynamic = rythm[i]
+    return clipseq
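
The core of `insert_rythm_from_clip` is a weighted sum of two min-max normalized densities, normalized once more. The sketch below reproduces only that arithmetic with NumPy; the hand-written `minmax` is a stand-in for `sklearn.preprocessing.minmax_scale`, the Savitzky-Golay smoothing step is deliberately left out, and all names are illustrative.

```python
import numpy as np


def minmax(x):
    """Scale a 1-D array to [0, 1]; a constant array maps to zeros."""
    x = np.asarray(x, dtype=float)
    span = x.max() - x.min()
    return np.zeros_like(x) if span == 0 else (x - x.min()) / span


def rhythm_scores(lyric_density, beat_density, stage_weights):
    """Combine two per-clip densities into one score per clip in [0, 1]."""
    # Normalize each feature separately so neither dominates the sum.
    combined = minmax(lyric_density) + minmax(beat_density)
    # Weight by song structure (for example a larger weight for choruses).
    weighted = np.asarray(stage_weights, dtype=float) * combined
    return minmax(weighted)


if __name__ == "__main__":
    print(rhythm_scores([1, 2, 3], [2, 2, 4], [1.0, 1.6, 1.0]))
```

In the original function the result is additionally smoothed with `savgol_filter(window_length=5, polyorder=3)` and rescaled, which presumably requires at least five clips.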

MuseV/MMCM/mmcm/music/music_map/lyric_process.py ADDED Viewed

	@@ -0,0 +1,515 @@

+from os.path import isfile
+import re
+import os
+from ...text.utils.read_text import read_xml2json
+# A very handy site for testing regular expressions
+# https://regex101.com/r/cW8jA6/2
+CHINESE_PATTERN = r"[\u4e00-\u9fff]+"
+NOT_CHINESE_PATTERN = r"[^\u4e00-\u9fa5]"
+ENGLISH_CHARACHTER_PATTERN = r"[a-zA-Z]+"
+WORD_PATTERN = r"\w+"  # on str patterns \w also matches Unicode word characters (e.g. CJK); it equals [a-zA-Z0-9_] only with re.ASCII
+NOT_WORD_PATTERN = r"\W+"
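+def has_target_string(lyric: str, pattern: str) -> bool:
+    """Check whether a lyric line contains the target string.
+    Args:
+        lyric (str): the lyric line
+        pattern (str): regular expression pattern of the target string
+    Returns:
+        bool: True if the pattern matches anywhere in the line
+    """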
+    matched = re.findall(pattern, lyric)
+    flag = len(matched) > 0
+    return flag
+def has_chinese_char(lyric: str) -> bool:
+    """Check whether the line contains Chinese characters.
+    Args:
+        lyric (str): the lyric line
+    Returns:
+        bool: True if the line contains Chinese characters
+    """
+    return has_target_string(lyric, CHINESE_PATTERN)
+def has_non_chinese_char(lyric: str) -> bool:
+    """Check whether the line contains non-Chinese characters; see https://git.woa.com/innovative_tech/CopyrightGroup/LyricTools/blob/master/lyric_tools/dataProcess.py#L53
+    Args:
+        lyric (str): the lyric line
+    Returns:
+        bool: True if the line contains non-Chinese characters
+    """
+    return has_target_string(lyric, NOT_CHINESE_PATTERN)
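+def has_english_alphabet_char(lyric: str) -> bool:
+    """Check whether the line contains English alphabet characters.
+    Args:
+        lyric (str): the lyric line
+    Returns:
+        bool: True if the line contains English letters
+    """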
+    return has_target_string(lyric, ENGLISH_CHARACHTER_PATTERN)
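+def check_is_lyric_row(lyric: str) -> bool:
+    """Check whether a string is an actual lyric line rather than metadata or credits.
+    Args:
+        lyric (str): the string to check
+    Returns:
+        bool: True if the string is a lyric line
+    """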
+    is_not_lyric = [
+        re.search(r"\[ti[:：]?", lyric),
+        re.search(r"\[ar[:：]?", lyric),
+        re.search(r"\[al[:：]?", lyric),
+        re.search(r"\[by[:：]?", lyric),
+        re.search(r"\[offset[:：]?", lyric),
+        re.search(r"词[:：]?\(\d+,\d+\)[:：]?", lyric),
+        re.search(r"曲[:：]?\(\d+,\d+\)[:：]?", lyric),
+        re.search(r"作\(\d+,\d+\)词[:：]?", lyric),
+        re.search(r"作\(\d+,\d+\)曲[:：]?", lyric),
+        re.search(r"演\(\d+,\d+\)唱[:：]?", lyric),
+        re.search(r"编\(\d+,\d+\)曲[:：]?", lyric),
+        re.search(r"吉\(\d+,\d+\)他[:：]", lyric),
+        re.search(r"人\(\d+,\d+\)声\(\d+,\d+\)录\(\d+,\d+\)音\(\d+,\d+\)师[:：]?", lyric),
+        re.search(r"人\(\d+,\d+\)声\(\d+,\d+\)录\(\d+,\d+\)音\(\d+,\d+\)棚[:：]?", lyric),
+        re.search(r"Vocal\s+\(\d+,\d+\)edite[:：]?", lyric),
+        re.search(r"混\(\d+,\d+\)音\(\d+,\d+\)/\(\d+,\d+\)母\(\d+,\d+\)带[:：]?", lyric),
+        re.search(r"混\(\d+,\d+\)音", lyric),
+        re.search(r"和\(\d+,\d+\)声\(\d+,\d+\)编\(\d+,\d+\)写[:：]?", lyric),
+        re.search(
+            r"词\(\d+,\d+\)版\(\d+,\d+\)权\(\d+,\d+\)管\(\d+,\d+\)理\(\d+,\d+\)方[:：]?", lyric
+        ),
+        re.search(
+            r"曲\(\d+,\d+\)版\(\d+,\d+\)权\(\d+,\d+\)管\(\d+,\d+\)理\(\d+,\d+\)方[:：]?", lyric
+        ),
+        re.search(r"联\(\d+,\d+\)合\(\d+,\d+\)出\(\d+,\d+\)品[:：]?", lyric),
+        re.search(r"录\(\d+,\d+\)音\(\d+,\d+\)作\(\d+,\d+\)品", lyric),
+        re.search(
+            r"录\(\d+,\d+\)音\(\d+,\d+\)作\(\d+,\d+\)品\(\d+,\d+\)监\(\d+,\d+\)制[:：]?", lyric
+        ),
+        re.search(r"制\(\d+,\d+\)作\(\d+,\d+\)人[:：]?", lyric),
+        re.search(r"不\(\d+,\d+\)得\(\d+,\d+\)翻\(\d+,\d+\)唱", lyric),
+        re.search(r"未\(\d+,\d+\)经\(\d+,\d+\)许\(\d+,\d+\)可", lyric),
+        re.search(r"酷\(\d+,\d+\)狗\(\d+,\d+\)音\(\d+,\d+\)乐", lyric),
+        re.search(r"[:：]", lyric),
+    ]
+    is_not_lyric = [x is not None for x in is_not_lyric]
+    is_not_lyric = any(is_not_lyric)
+    is_lyric = not is_not_lyric
+    return is_lyric
+def lyric2clip(lyric: str) -> dict:
+    """Convert a line of lyric into a clip.
+    For the Clip definition, see https://git.woa.com/innovative_tech/VideoMashup/blob/master/videomashup/media/clip.py
+    Args:
+        lyric (str): one lyric line with its [start,duration] and per-character (start,duration) timestamps
+    Returns:
+        dict: the line as a Clip dict
+    """
+    time_str_groups = re.findall(r"\d+,\d+", lyric)
+    line_time_start = round(int(time_str_groups[0].split(",")[0]) / 1000, 3)
+    line_duration = round(int(time_str_groups[0].split(",")[-1]) / 1000, 3)
+    line_end_time = line_time_start + line_duration
+    last_word_time_start = round(int(time_str_groups[-1].split(",")[0]) / 1000, 3)
+    last_word_duration = round(int(time_str_groups[-1].split(",")[-1]) / 1000, 3)
+    last_word_end_time = last_word_time_start + last_word_duration
+    actual_duration = min(line_end_time, last_word_end_time) - line_time_start
+    lyric = re.sub(r"\[\d+,\d+\]", "", lyric)
+    # by yuuhong: split out each character together with its timing
+    words_with_timestamp = get_words_with_timestamp(lyric)
+    lyric = re.sub(r"\(\d+,\d+\)", "", lyric)
+    dct = {
+        "time_start": line_time_start,
+        "duration": actual_duration,
+        "text": lyric,
+        "original_text": lyric,
+        "timepoint_type": -1,
+        "clips": words_with_timestamp,
+    }
+    return dct
+# by yuuhong
+# Split one QRC line into its individual characters
+# Example lyric: 漫(17316,178)步(17494,174)走(17668,193)在(17861,183) (18044,0)莎(18044,153)玛(18197,159)丽(18356,176)丹(18532,200)
+def get_words_with_timestamp(lyric):
+    words_with_timestamp = []
+    elements = lyric.split(")")
+    for element in elements:
+        sub_elements = element.split("(")
+        if len(sub_elements) != 2:
+            continue
+        text = sub_elements[0]
+        timestamp = sub_elements[1]
+        if re.match(r"\d+,\d+", timestamp):
+            # valid timestamp
+            time_start_str = timestamp.split(",")[0]
+            time_start = round(int(time_start_str) / 1000, 3)
+            duration_str = timestamp.split(",")[1]
+            duration = round(int(duration_str) / 1000, 3)
+            clip = {"text": text, "time_start": time_start, "duration": duration}
+            words_with_timestamp.append(clip)
+    return words_with_timestamp
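+def lyric2clips(lyric: str, th: float = 0.75) -> list:
+    """Convert one lyric line into at least one clip. Chinese lines are split on spaces; if splitting would produce a clip that is too short, the line is kept whole.
+    Args:
+        lyric (str): such as [173247,3275]去(173247,403)吗(173649,677) 配(174326,189)吗(174516,593) 这(175108,279)
+        th (float, optional): minimum clip length; if a split clip is shorter, the line is kept whole. Defaults to 0.75.
+    Returns:
+        list: sequence of lyric clips
+    """
+    # Only Chinese lines are split on spaces for now; a line containing English letters is kept whole
+    # A line is also kept whole if splitting would leave a clip that is too short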
+    if has_english_alphabet_char(lyric):
+        return [lyric2clip(lyric)]
+    splited_lyric = lyric.split(" ")
+    if len(splited_lyric) == 1:
+        return [lyric2clip(splited_lyric[0])]
+    line_time_str, sub_lyric = re.split(r"]", splited_lyric[0])
+    line_time_groups = re.findall(r"\d+,\d+", line_time_str)
+    line_time_start = round(int(line_time_groups[0].split(",")[0]) / 1000, 3)
+    line_duration = round(int(line_time_groups[0].split(",")[-1]) / 1000, 3)
+    splited_lyric[0] = sub_lyric
+    # In the lyric xml each word is immediately followed by its timestamp, so a space should come after the timestamp. Sometimes the space sits after the word and before the timestamp instead, which has to be corrected.
+    # Wrong: [173247,3275]去(173247,403)吗 (173649,677)配(174326,189)吗 (174516,593)这(175108,279)
+    # Wrong: [46122,2082]以(46122,213)身(46335,260)淬(46595,209)炼(46804,268)天(47072,250)地(47322,370)造(47692,341)化 (48033,172)
+    # Corrected: [173247,3275]去(173247,403)吗(173649,677) 配(174326,189)吗(174516,593) 这(175108,279)
+    for i in range(len(splited_lyric)):
+        if splited_lyric[i] == "":
+            del splited_lyric[i]
+            break
+        if splited_lyric[i][-1] != ")":
+            next_lyric_time_start = re.search(
+                r"\(\d+,\d+\)", splited_lyric[i + 1]
+            ).group(0)
+            splited_lyric[i] += next_lyric_time_start
+            splited_lyric[i + 1] = re.sub(
+                next_lyric_time_start, "", splited_lyric[i + 1]
+            )
+            splited_lyric[i + 1] = re.sub(r"\(\)", "", splited_lyric[i + 1])
+    lyric_text = re.sub(r"\[\d+,\d+\]", "", lyric)
+    lyric_text = re.sub(r"\(\d+,\d+\)", "", lyric_text)
+    clips = []
+    has_short_clip = False
+    for sub_lyric in splited_lyric:
+        sub_lyric_groups = re.findall(r"\d+,\d+", sub_lyric)
+        sub_lyric_1st_word_time_start = round(
+            int(sub_lyric_groups[0].split(",")[0]) / 1000, 3
+        )
+        sub_lyric_last_word_time_start = round(
+            int(sub_lyric_groups[-1].split(",")[0]) / 1000, 3
+        )
+        sub_lyric_last_word_duration = round(
+            int(sub_lyric_groups[-1].split(",")[-1]) / 1000, 3
+        )
+        sub_lyric_last_word_time_end = (
+            sub_lyric_last_word_time_start + sub_lyric_last_word_duration
+        )
+        sub_lyric_duration = (
+            sub_lyric_last_word_time_end - sub_lyric_1st_word_time_start
+        )
+        if sub_lyric_duration <= th:
+            has_short_clip = True
+            break
+        sub_lyric_text = re.sub(r"\[\d+,\d+\]", "", sub_lyric)
+        sub_lyric_text = re.sub(r"\(\d+,\d+\)", "", sub_lyric_text)
+        # original_text keeps the full line rather than sub_lyric_text, so each clip retains the lyrics of its related clips for semantic continuity
+        dct = {
+            "time_start": sub_lyric_1st_word_time_start,
+            "duration": sub_lyric_duration,
+            "text": sub_lyric_text,
+            "original_text": lyric_text,
+            "timepoint_type": -1,
+        }
+        clips.append(dct)
+    if has_short_clip:
+        clips = [lyric2clip(lyric)]
+    return clips
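+def is_songname(lyric: str) -> bool:
+    """Whether the line is the song title. Title lines contain ti, e.g. [ti:霍元甲 (《霍元甲》电影主题曲)]
+    Args:
+        lyric (str): one line of QRC text
+    Returns:
+        bool: True if the line is a song title tag
+    """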
+    return has_target_string(lyric, r"\[ti[:：]?")
+def get_songname(lyric: str) -> str:
+    """Extract the song title from the text. The input must look like [ti:霍元甲 (《霍元甲》电影主题曲)]
+    Args:
+        lyric (str): QRC text line containing the song title
+    Returns:
+        str: song title
+    """
+    return lyric.split("(")[0][4:-1]
+def is_album(lyric: str) -> bool:
+    """Whether the line contains the album name. The text must look like [al:霍元甲]
+    Args:
+        lyric (str): one line of QRC text
+    Returns:
+        bool: True if the line is an album tag
+    """
+    return has_target_string(lyric, r"\[al[:：]?")
+def get_album(lyric: str) -> str:
+    """Extract the album name. The text must look like [al:霍元甲]
+    Args:
+        lyric (str): QRC text line containing the album name
+    Returns:
+        str: album name
+    """
+    return lyric[4:-1]
+def is_singer(lyric: str) -> bool:
+    """Whether the line contains the singer name. The target text looks like [ar:周杰伦]
+    Args:
+        lyric (str): one line of QRC text
+    Returns:
+        bool: True if the line is a singer tag
+    """
+    return has_target_string(lyric, r"\[ar[:：]?")
+def get_singer(lyric: str) -> str:
+    """Extract the singer name. The text must look like [ar:周杰伦]
+    Args:
+        lyric (str): QRC text line containing the singer name
+    Returns:
+        str: singer name
+    """
+    return lyric[4:-1]
+def lyric2musicinfo(lyric: dict) -> dict:
+    """Convert parsed QRC lyric content into musicinfo, a dict.
+    See https://git.woa.com/innovative_tech/VideoMashup/blob/master/videomashup/media/media_info.py#L19
+    {
+        "meta_info": {},
+        "sub_meta_info": [],
+        "clips": [
+            clip
+        ]
+    }
+    Args:
+        lyric (dict): QRC content parsed from xml; the lyric string is read from lyric["QrcInfos"]["LyricInfo"]["Lyric_1"]["@LyricContent"]
+    Returns:
+        musicinfo: music map dict, see https://git.woa.com/innovative_tech/VideoMashup/blob/master/videomashup/media/media_info.py#L19
+    """
+    lyrics = lyric["QrcInfos"]["LyricInfo"]["Lyric_1"]["@LyricContent"]
+    musicinfo = {
+        "meta_info": {
+            "mediaid": None,
+            "media_name": None,
+            "singer": None,
+        },
+        "sub_meata_info": {},
+        "clips": [],
+    }
+    # lyrics = [line.strip() for line in re.split(r"[\t\n\s+]", lyrics)]
+    lyrics = ["[" + line.strip() for line in re.split(r"\[", lyrics)]
+    next_is_title_row = False
+    lyric_clips = []
+    for line in lyrics:
+        if is_songname(line):
+            musicinfo["meta_info"]["media_name"] = get_songname(line)
+            continue
+        if is_singer(line):
+            musicinfo["meta_info"]["singer"] = get_singer(line)
+            continue
+        if is_album(line):
+            musicinfo["meta_info"]["album"] = get_album(line)
+            continue
+        is_lyric_row = check_is_lyric_row(line)
+        if next_is_title_row:
+            next_is_title_row = False
+            continue
+        # remove title row
+        if not next_is_title_row and re.search(r"\[offset[:：]", line):
+            next_is_title_row = True
+        if is_lyric_row and re.match(r"\[\d+,\d+\]", line):
+            lyric_clip = lyric2clip(line)
+            lyric_clips.append(lyric_clip)
+            clips = lyric2clips(line)
+            musicinfo["clips"].extend(clips)
+    musicinfo["meta_info"]["lyric"] = lyric_clips
+    return musicinfo
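+def lrc_timestr2time(time_str: str) -> float:
+    """Convert an lrc timestamp string such as 00:00.00 (the text inside the brackets) into seconds
+    Args:
+        time_str (str): timestamp in mm:ss.xx form
+    Returns:
+        float: time in seconds
+    """
+    # the part after the colon is already seconds with a decimal fraction, so 12.34 stays 12.34
+    m, s = time_str.split(":")
+    return round(float(m) * 60 + float(s), 3)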
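+def get_lrc_line_time(text: str, time_pattern: str) -> float:
+    """Extract the timestamp from an lrc line such as \"[00:00.00]本字幕由天琴实验室独家AI字幕技术生成\" and convert it to seconds
+    Args:
+        text (str): input text
+        time_pattern (str): regular expression for the timestamp string
+    Returns:
+        float: timestamp of the line, in seconds
+    """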
+    time_str = re.search(time_pattern, text).group(0)
+    return lrc_timestr2time(time_str)
+def lrc_lyric2clip(lyric: str, time_pattern: str, duration: float) -> dict:
+    """Convert one line of lrc text into a Clip dict
+    Args:
+        lyric (str): such as \"[00:00.00]本字幕由天琴实验室独家AI字幕技术生成\"
+        time_pattern (str): regular expression for the timestamp string, such as r"\d+:\d+\.\d+"
+        duration (float): duration of the clip, in seconds
+    Returns:
+        dict: the converted Clip
+            For the Clip definition see https://git.woa.com/innovative_tech/VideoMashup/blob/master/videomashup/media/clip.py
+    """
+    time_str = get_lrc_line_time(lyric, time_pattern=time_pattern)
+    text = re.sub(time_pattern, "", lyric)
+    text = text[2:]
+    clip = {
+        "time_start": time_str,
+        "duration": duration,
+        "text": text,
+        "timepoint_type": -1,
+    }
+    return clip
+def lrc2musicinfo(lyric: str, time_pattern: str = r"\d+:\d+\.\d+") -> dict:
+    r"""Convert lrc lyrics into a music map
+    Args:
+        lyric (str): path to an lrc file, or the lrc text itself
+        time_pattern (str, optional): regular expression for lrc timestamps. Defaults to r"\d+:\d+\.\d+".
+    Returns:
+        dict: the generated music map dict; for the definition see https://git.woa.com/innovative_tech/VideoMashup/blob/master/videomashup/music/music_info.py
+    """
+    if isinstance(lyric, str):
+        if os.path.isfile(lyric):
+            with open(lyric, "r") as f:
+                lyric = [line.strip() for line in f.readlines()]
+            # pass time_pattern along so a custom pattern is not silently dropped
+            return lrc2musicinfo(lyric, time_pattern=time_pattern)
+        else:
+            lyric = lyric.split("\n")
+            return lrc2musicinfo(lyric, time_pattern=time_pattern)
+    else:
+        musicinfo = {
+            "meta_info": {
+                "mediaid": None,
+                "media_name": None,
+                "singer": None,
+            },
+            "sub_meata_info": {},
+            "clips": [],
+        }
+        # lyrics = [line.strip() for line in re.split(r"[\t\n\s+]", lyrics)]
+        lyric_clips = []
+        rows = len(lyric)
+        for i, line in enumerate(lyric):
+            if is_songname(line):
+                musicinfo["meta_info"]["media_name"] = line[4:-1]
+                continue
+            if is_singer(line):
+                musicinfo["meta_info"]["singer"] = line[4:-1]
+                continue
+            if is_album(line):
+                musicinfo["meta_info"]["album"] = line[4:-1]
+                continue
+            if len(re.findall(time_pattern, line)) > 0:
+                if i < rows - 1:
+                    time_start = get_lrc_line_time(line, time_pattern=time_pattern)
+                    next_line_time_start = get_lrc_line_time(
+                        lyric[i + 1], time_pattern=time_pattern
+                    )
+                    duration = next_line_time_start - time_start
+                else:
+                    duration = 1
+                clip = lrc_lyric2clip(
+                    line, duration=duration, time_pattern=time_pattern
+                )
+                musicinfo["clips"].append(clip)
+        musicinfo["meta_info"]["lyric"] = lyric_clips
+        return musicinfo
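+def lyricfile2musicinfo(path: str) -> dict:
+    """Convert a lyric file into a music map. The file can be a QRC xml file or an lrc file
+        TODO: add osu support
+    Args:
+        path (str): path to the lyric file
+    Returns:
+        dict: music map dict; for the definition see https://git.woa.com/innovative_tech/VideoMashup/blob/master/videomashup/music/music_info.py
+    """
+    # splitext also handles file names that contain more than one dot
+    filename, ext = os.path.splitext(os.path.basename(path))
+    ext = ext.lstrip(".")
+    if ext == "xml":
+        lyric = read_xml2json(path)
+        musicinfo = lyric2musicinfo(lyric)
+    elif ext == "lrc":
+        musicinfo = lrc2musicinfo(path)
+    else:
+        raise ValueError(f"unsupported lyric file extension: {ext!r}")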
+    musicinfo["meta_info"]["mediaid"] = filename
+    return musicinfo
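
The per-word QRC parsing above can be tried outside the package with a small standalone sketch. It uses a single regular expression instead of splitting on parentheses, so it is a swapped-in technique and not the code in this commit. The names `parse_qrc_words` and `WORD_PATTERN` are hypothetical, and the sample line is made up in the same format as the examples in the comments.

```python
import re

# Hypothetical helper, not part of mmcm: a regex-based variant of the
# per-word parsing that get_words_with_timestamp performs.
# Each word is "text(start_ms,duration_ms)".
WORD_PATTERN = re.compile(r"([^()]*)\((\d+),(\d+)\)")


def parse_qrc_words(line: str) -> list:
    """Return one dict per word with start time and duration in seconds."""
    # Drop the leading [line_start_ms,line_duration_ms] tag if present.
    body = re.sub(r"\[\d+,\d+\]", "", line)
    words = []
    for text, start_ms, duration_ms in WORD_PATTERN.findall(body):
        words.append(
            {
                "text": text,
                "time_start": round(int(start_ms) / 1000, 3),
                "duration": round(int(duration_ms) / 1000, 3),
            }
        )
    return words


# Made-up sample line in QRC format, including a zero-length space token.
words = parse_qrc_words(
    "[17316,1416]漫(17316,178)步(17494,174) (18044,0)莎(18044,153)"
)
for word in words:
    print(word)
```

Like the original function, the sketch keeps the space token as its own entry, so callers that only want visible characters would still need to filter it out.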

MuseV/MMCM/mmcm/music/music_map/meta_info.py ADDED

	@@ -0,0 +1,21 @@

+from __future__ import annotations
+from ...data import MetaInfo
+class MusicMetaInfo(MetaInfo):
+    def __init__(self, mediaid=None, media_name=None, media_duration=None, signature=None, media_path: str = None, media_map_path: str = None,
+        singer=None,
+        lyric_path=None,
+        genre=None,
+        language=None,
+        start: float = None, end: float = None, ext=None, **kwargs):
+        super().__init__(mediaid, media_name, media_duration, signature, media_path, media_map_path, start, end, ext, **kwargs)
+        self.singer = singer
+        self.genre = genre
+        self.language = language
+        self.lyric_path = lyric_path
+    @classmethod
+    def from_data(cls, data) -> MusicMetaInfo:
+        return MusicMetaInfo(**data)

MuseV/MMCM/mmcm/music/music_map/mss_map.py ADDED

	@@ -0,0 +1,185 @@

+import logging
+from .music_clip import MusicClip, MusicClipSeq
+from .music_map import MusicMap
+from ...data.clip.clip_process import find_idx_by_time
+logger = logging.getLogger(__name__)  # pylint: disable=invalid-name
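+def insert_mss_2_clipseq(
+    clipseq: MusicClipSeq, mss_clipseq: MusicClipSeq
+) -> MusicClipSeq:
+    """Assign the structure (stage) field from mss to each clip of the target clipseq, using the mss clip found at the clip's start time
+    Args:
+        clipseq (MusicClipSeq): target clip sequence
+        mss_clipseq (MusicClipSeq): reference clip sequence carrying the stage information
+    Returns:
+        MusicClipSeq: the clip sequence with the stage field updated
+    """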
+    for i, clip in enumerate(clipseq):
+        idx = find_idx_by_time(mss_clipseq, clip.time_start)
+        if idx is not None:
+            clipseq[i].stage = mss_clipseq[idx].stage
+        else:
+            clipseq[i].stage = "unknow"
+    return clipseq
+def get_mss_musicinfo(songid: str) -> MusicMap:
+    """Fetch the song structure information from Tianqin Lab through the interface in media_data
+    Args:
+        songid (str): song id
+    Returns:
+        MusicMap: music map built from the mss structure information
+    """
+    try:
+        from media_data.oi.tianqin_database import get_mss
+        mss = get_mss(songid=songid)
+    except Exception as e:
+        logger.warning("get mss failed, songid={}".format(songid))
+        logger.exception(e)
+        mss = None
+    mss_musicinfo = MusicMap(mss) if mss is not None else None
+    return mss_musicinfo
+def merge_mss(musicinfo: MusicMap, mss: MusicMap) -> MusicMap:
+    """Merge the mss music map into the target music map
+    Args:
+        musicinfo (MusicMap): target music map
+        mss (MusicMap): mss music map to merge in
+    Returns:
+        MusicMap: the merged music map
+    """
+    musicinfo.meta_info.bpm = mss.meta_info.bpm
+    if len(mss.clipseq) > 0:
+        musicinfo.clipseq = insert_mss_2_clipseq(musicinfo.clipseq, mss.clipseq)
+    return musicinfo
+def generate_mss_from_lyric(lyrics: list, audio_duration: float, th=8) -> MusicClipSeq:
+    # stages: ["intro", "VA", "CA", "bridge", "VB", "CB", "end"]
+    mss = []
+    n_lyric = len(lyrics)
+    for lyric_idx, line_lyric_dct in enumerate(lyrics):
+        time_start = line_lyric_dct["time_start"]
+        duration = line_lyric_dct["duration"]
+        time_end = time_start + duration
+        # text = line_lyric_dct["text"]
+        if lyric_idx == 0:
+            sub_mss = {
+                "stage": "intro",
+                "time_start": 0,
+                "duration": time_start,
+            }
+            mss.append(sub_mss)
+            continue
+        if lyric_idx == n_lyric - 1:
+            sub_mss = {
+                "stage": "end",
+                "time_start": time_end,
+                "duration": audio_duration - time_end,
+            }
+            mss.append(sub_mss)
+            continue
+        if lyrics[lyric_idx + 1]["time_start"] - time_end >= th:
+            sub_mss = {
+                "stage": "bridge",
+                "time_start": time_end,
+                "duration": lyrics[lyric_idx + 1]["time_start"] - time_end,
+            }
+            mss.append(sub_mss)
+    mss_lyric = []
+    for sub_idx, sub_mss in enumerate(mss):
+        if sub_idx == len(mss) - 1:
+            continue
+        time_end = sub_mss["time_start"] + sub_mss["duration"]
+        next_time_start = mss[sub_idx + 1]["time_start"]
+        if next_time_start - time_end > 0.1:
+            mss_lyric.append(
+                {
+                    "stage": "lyric",
+                    "time_start": time_end,
+                    "duration": next_time_start - time_end,
+                }
+            )
+    mss.extend(mss_lyric)
+    mss = sorted(mss, key=lambda x: x["time_start"])
+    mss = MusicClipSeq(mss)
+    return mss
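+def refine_mss_info_from_tianqin(
+    mss_info: MusicMap, lyricseq: MusicClipSeq
+) -> MusicMap:
+    """Refine the song structure information from Tianqin.
+    Before: the Tianqin structure only contains each lyric line and its stage; the segments are not contiguous in time and do not cover the whole song.
+    After: intro, bridge, and end are added, neighboring segments with the same stage are merged, and the segments are contiguous and cover the whole song.
+    Args:
+        mss_info (MusicMap): Tianqin song structure
+        lyricseq (ClipSeq): original lyric information, used to compute intro, bridge, and end. It could also be obtained from mss_info.
+    Returns:
+        MusicMap: the refined song structure information
+    """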
+    lyric_mss_clipseq = generate_mss_from_lyric(
+        lyricseq, audio_duration=mss_info.meta_info.duration
+    )
+    new_mss_clipseq = []
+    # lyric_mss_dct = lyric_mss_clipseq.to_dct()
+    # mss_dct = mss_info.clipseq.to_dct()
+    for l_clip_idx, lyric_clip in enumerate(lyric_mss_clipseq):
+        if lyric_clip.stage != "lyric":
+            new_mss_clipseq.append(lyric_clip)
+        else:
+            new_clip_time_start = lyric_clip.time_start
+            last_stage = "ANewClipStart"
+            for clip_idx, clip in enumerate(mss_info.clipseq):
+                if clip.time_start < new_clip_time_start:
+                    continue
+                if (
+                    clip.time_start >= lyric_mss_clipseq[l_clip_idx + 1].time_start
+                    or clip_idx == len(mss_info.clipseq) - 1
+                ):
+                    if clip.time_start >= lyric_mss_clipseq[l_clip_idx + 1].time_start:
+                        stage = last_stage
+                    # e.g. in the song 偶阵雨 the last lyric section has only one lyric line
+                    if clip_idx == len(mss_info.clipseq) - 1:
+                        stage = clip.stage
+                    new_clip_time_end = lyric_mss_clipseq[l_clip_idx + 1].time_start
+                    new_stage_clip = {
+                        "time_start": new_clip_time_start,
+                        "duration": new_clip_time_end - new_clip_time_start,
+                        "stage": stage,
+                    }
+                    new_mss_clipseq.append(MusicClip(**new_stage_clip))
+                    new_clip_time_start = new_clip_time_end
+                    last_stage = clip.stage
+                    break
+                if clip.stage != last_stage:
+                    if last_stage == "ANewClipStart":
+                        last_stage = clip.stage
+                        continue
+                    new_clip_time_end = mss_info.clipseq[clip_idx].time_start
+                    new_stage_clip = {
+                        "time_start": new_clip_time_start,
+                        "duration": new_clip_time_end - new_clip_time_start,
+                        "stage": last_stage,
+                    }
+                    new_mss_clipseq.append(MusicClip(**new_stage_clip))
+                    new_clip_time_start = new_clip_time_end
+                    last_stage = clip.stage
+    new_mss_clipseq = MusicClipSeq(sorted(new_mss_clipseq, key=lambda x: x.time_start))
+    mss_info.clipseq = new_mss_clipseq
+    return mss_info
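
The idea behind `generate_mss_from_lyric` (silence before the first line is the intro, long silences between lines are bridges, the tail is the end, and everything in between is lyric) can be sketched on plain dicts. This is a simplified single-pass variant written for illustration, not the two-pass logic in this commit; `label_song_structure` and `_segment` are hypothetical names, and the input lines are made up.

```python
def _segment(stage: str, start: float, end: float) -> dict:
    """Build one structure segment in the same dict layout as the clips above."""
    return {"stage": stage, "time_start": start, "duration": end - start}


def label_song_structure(lyrics: list, audio_duration: float, th: float = 8) -> list:
    """Cover [0, audio_duration] with intro / lyric / bridge / end segments.

    Hypothetical simplified variant: a gap of at least `th` seconds between
    two lyric lines is labelled as a bridge.
    """
    segments = []
    block_start = None  # start of the current run of lyric lines
    prev_end = None  # end time of the previous lyric line
    for line in lyrics:
        start = line["time_start"]
        end = start + line["duration"]
        if block_start is None:
            # Silence before the first lyric line is the intro.
            if start > 0:
                segments.append(_segment("intro", 0, start))
            block_start = start
        elif start - prev_end >= th:
            # Long silence: close the lyric block and emit a bridge.
            segments.append(_segment("lyric", block_start, prev_end))
            segments.append(_segment("bridge", prev_end, start))
            block_start = start
        prev_end = end
    if block_start is not None:
        segments.append(_segment("lyric", block_start, prev_end))
        if audio_duration > prev_end:
            segments.append(_segment("end", prev_end, audio_duration))
    return segments


lyrics = [
    {"time_start": 10, "duration": 4},
    {"time_start": 15, "duration": 4},
    {"time_start": 30, "duration": 5},
]
structure = label_song_structure(lyrics, audio_duration=40)
for seg in structure:
    print(seg)
```

Because every segment starts where the previous one ends, the durations add up to the audio duration, which is the "contiguous and complete" property that `refine_mss_info_from_tianqin` aims for.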

MuseV/MMCM/mmcm/music/music_map/music_clip.py ADDED

	@@ -0,0 +1,83 @@

+from __future__ import annotations
+from typing import Dict, List
+from ...data.clip import Clip, ClipSeq
+class MusicClip(Clip):
+    def __init__(self, time_start: float, duration: float, clipid: int = None, media_type: str = None, mediaid: str = None, timepoint_type: str = None, text: str = None, stage: str = None, path: str = None, duration_num: int = None, similar_clipseq: MatchedClipIds = None, dynamic: float = None, **kwargs):
+        super().__init__(time_start, duration, clipid, media_type, mediaid, timepoint_type, text, stage, path, duration_num, similar_clipseq, dynamic, **kwargs)
+    @property
+    def text_num(self):
+        return self._cal_text_num()
+    @property
+    def original_text_num(self):
+        return self._cal_text_num(text_mode=1)
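+    def _cal_text_num(self, text_mode: int = 0) -> int:
+        """Count the number of words in the text
+        Args:
+            text_mode (int, optional): 0 selects text, any other value selects original_text. Defaults to 0.
+        Returns:
+            int: number of space-separated tokens in the selected text, 0 if the text is None
+        """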
+        if text_mode == 0:
+            text = self.text
+        else:
+            text = self.original_text
+        if text is None:
+            n_text = 0
+        else:
+            text = text.strip().split(" ")
+            n_text = len(text)
+        return n_text
+    @property
+    def text_num_per_second(self):
+        """Number of text tokens per second"""
+        return self._cal_text_num_per_second(mode=0)
+    @property
+    def original_text_num_per_second(self):
+        """Number of original_text tokens per second"""
+        return self._cal_text_num_per_second(mode=1)
+    @property
+    def tnps(self):
+        """Number of text tokens per second"""
+        return self.text_num_per_second
+    @property
+    def original_tnps(self):
+        """Number of original_text tokens per second"""
+        return self.original_text_num_per_second
+    def _cal_text_num_per_second(self, mode=0):
+        """Compute the number of text tokens per second"""
+        text_num = self.text_num if mode == 0 else self.original_text_num
+        return text_num / self.duration
+    @classmethod
+    def from_data(cls, data: Dict):
+        return MusicClip(**data)
+class MusicClipSeq(ClipSeq):
+    def __init__(self, items: List[Clip] = None):
+        super().__init__(items)
+        self.clipseq = self.data
+    @classmethod
+    def from_data(cls, clipseq: List[Dict]) -> MusicClipSeq:
+        new_clipseq = []
+        for clip in clipseq:
+            video_clip = MusicClip.from_data(clip)
+            new_clipseq.append(video_clip)
+        video_clipseq = MusicClipSeq(new_clipseq)
+        return video_clipseq
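
The token counting in `MusicClip` can be checked in isolation with the sketch below. It mirrors the space-based split used in `_cal_text_num`, which means a Chinese line without spaces counts as a single token. The function names are hypothetical, and the zero-duration guard is an assumption added here; the class above divides by `self.duration` directly.

```python
from typing import Optional


def count_tokens(text: Optional[str]) -> int:
    """Count space-separated tokens, as _cal_text_num does."""
    if text is None:
        return 0
    return len(text.strip().split(" "))


def tokens_per_second(text: Optional[str], duration: float) -> float:
    """Tokens per second; returns 0.0 for a zero duration (added guard)."""
    if not duration:
        return 0.0
    return count_tokens(text) / duration


print(count_tokens("去吗 配吗 这"))
print(count_tokens("漫步走在莎玛丽丹"))
print(tokens_per_second("去吗 配吗 这", 1.5))
```

If a per-character rate is wanted for Chinese lyrics, counting `len(text.replace(" ", ""))` instead would be one option, at the cost of changing what the property means for space-separated text.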

MuseV/MMCM/mmcm/music/music_map/music_map.py ADDED

	@@ -0,0 +1,140 @@

+from __future__ import annotations
+from typing import List, Dict
+from moviepy.editor import concatenate_audioclips, AudioClip, AudioFileClip
+from ...data import MediaMap, MediaMapEmb, MetaInfo, MediaMapSeq
+from ...data.clip.clip_process import find_time_by_stage
+from ...data.emb.h5py_emb import H5pyMediaMapEmb
+from ...utils.util import load_dct_from_file
+from .clip_process import get_stageseq_from_clipseq
+from .music_clip import MusicClip, MusicClipSeq
+from .meta_info import MusicMetaInfo
+class MusicMap(MediaMap):
+    def __init__(
+        self,
+        meta_info: MetaInfo,
+        clipseq: MusicClipSeq,
+        lyricseq: MusicClipSeq = None,
+        stageseq: MusicClipSeq = None,
+        frameseq: MusicClipSeq = None,
+        emb: MediaMapEmb = None,
+        **kwargs,
+    ):
+        self.lyricseq = lyricseq
+        super().__init__(meta_info, clipseq, stageseq, frameseq, emb, **kwargs)
+        if self.stageseq is None:
+            self.stageseq = MusicClipSeq.from_data(
+                get_stageseq_from_clipseq(self.clipseq)
+            )
+            self.stageseq.preprocess()
+    def preprocess(self):
+        if (
+            hasattr(self.meta_info, "target_stages")
+            and self.meta_info.target_stages is not None
+        ):
+            self.set_start_end_by_target_stages()
+        super().preprocess()
+        self.spread_metainfo_2_clip(
+            target_keys=[
+                "media_path",
+                "media_map_path",
+                "emb_path",
+                "media_duration",
+                "mediaid",
+                "media_name",
+                "emb",
+            ]
+        )
+    def set_start_end_by_target_stages(self):
+        target_stages = self.meta_info.target_stages
+        if not isinstance(target_stages, List):
+            target_stages = [target_stages]
+        start, _ = find_time_by_stage(self.stageseq, target_stages[0])
+        _, end = find_time_by_stage(self.stageseq, target_stages[-1])
+        self.meta_info.start = start
+        self.meta_info.end = end
+    @property
+    def audio_clip(self) -> AudioFileClip:
+        """Load the audio covered by the actual ClipSeq
+        Returns:
+            AudioClip: audio clip from Moviepy
+        """
+        audio_clip = AudioFileClip(self.meta_info.media_path)
+        audio_clip = audio_clip.subclip(self.meta_info.start, self.meta_info.end)
+        return audio_clip
+    @classmethod
+    def from_json_path(
+        cls, path: str, emb_path: str, media_path: str = None, **kwargs
+    ) -> MusicMap:
+        media_map = load_dct_from_file(path)
+        emb = H5pyMediaMapEmb(emb_path)
+        return cls.from_data(media_map, emb=emb, media_path=media_path, **kwargs)
+    @classmethod
+    def from_data(
+        cls, data: Dict, emb: H5pyMediaMapEmb, media_path: str = None, **kwargs
+    ) -> MusicMap:
+        meta_info = MusicMetaInfo.from_data(data.get("meta_info", {}))
+        meta_info.media_path = media_path
+        clipseq = MusicClipSeq.from_data(data.get("clipseq", []))
+        stageseq = MusicClipSeq.from_data(data.get("stageseq", []))
+        lyricseq = MusicClipSeq.from_data(data.get("lyricseq", []))
+        target_keys = ["meta_info", "clipseq", "frameseq", "stageseq", "lyricseq"]
+        dct = {k: data[k] for k in data.keys() if k not in target_keys}
+        dct.update(**kwargs)
+        video_map = MusicMap(
+            meta_info=meta_info,
+            clipseq=clipseq,
+            stageseq=stageseq,
+            lyricseq=lyricseq,
+            emb=emb,
+            **dct,
+        )
+        return video_map
+    def to_dct(
+        self, target_keys: List[str] = None, ignored_keys: List[str] = None
+    ) -> Dict:
+        dct = {}
+        dct["meta_info"] = self.meta_info.to_dct(
+            target_keys=target_keys, ignored_keys=ignored_keys
+        )
+        dct["clipseq"] = self.clipseq.to_dct(
+            target_keys=target_keys, ignored_keys=ignored_keys
+        )
+        if self.frameseq is not None:
+            dct["frameseq"] = self.frameseq.to_dct(
+                target_keys=target_keys, ignored_keys=ignored_keys
+            )
+        else:
+            dct["frameseq"] = None
+        if self.stageseq is not None:
+            dct["stageseq"] = self.stageseq.to_dct(
+                target_keys=target_keys, ignored_keys=ignored_keys
+            )
+        else:
+            dct["stageseq"] = None
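+        # lyricseq defaults to None in __init__, so guard it like frameseq and stageseq
+        if self.lyricseq is not None:
+            dct["lyricseq"] = self.lyricseq.to_dct(
+                target_keys=target_keys, ignored_keys=ignored_keys
+            )
+        else:
+            dct["lyricseq"] = None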
+        return dct
+class MusicMapSeq(MediaMapSeq):
+    def __init__(self, maps: List[MusicMap]) -> None:
+        super().__init__(maps)
+    @property
+    def audio_clip(self) -> AudioFileClip:
+        audio_clip_lst = [m.audio_clip for m in self.maps]
+        audio_clip = concatenate_audioclips(audio_clip_lst)
+        return audio_clip

MuseV/MMCM/mmcm/music/music_map/music_map_demp.py ADDED

	@@ -0,0 +1,58 @@

+from moviepy.editor import (
+    ColorClip,
+    concatenate_videoclips,
+    AudioFileClip,
+    CompositeVideoClip,
+)
+from ...vision.video_map.video_lyric import render_lyric2video
+from ...vision.video_map.video_writer import write_videoclip
+from .music_map import MusicMap
+def generate_music_map_videodemo(
+    music_map: MusicMap,
+    path: str,
+    audio_path: str,
+    render_lyric: bool = True,
+    width: int = 360,
+    height: int = 240,
+    fps: int = 25,
+    n_thread: int = 8,
+    colors: list = [[51, 161, 201], [46, 139, 87]],
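+) -> None:
+    """Given a music map, generate the corresponding transition video demo; the video content is just simple color switching
+    Args:
+        music_map (MusicMap): music map to visualize
+        path (str): output path of the visualization video
+        audio_path (str): path of the audio that the music map describes
+        render_lyric (bool, optional): whether to render the lyrics, which are stored in the music map. Defaults to True.
+        width (int, optional): width of the visualization video. Defaults to 360.
+        height (int, optional): height of the visualization video. Defaults to 240.
+        fps (int, optional): fps of the visualization video. Defaults to 25.
+        n_thread (int, optional): number of threads used to write the visualization video. Defaults to 8.
+        colors (list, optional): colors used in the visualization video. Defaults to [[51, 161, 201], [46, 139, 87]].
+    """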
+    audio_clip = AudioFileClip(audio_path)
+    video_clips = []
+    size = (width, height)
+    for i, clip in enumerate(music_map.clipseq):
+        clip = ColorClip(
+            size=size, color=colors[i % len(colors)], duration=clip.duration
+        )
+        video_clips.append(clip)
+    video_clips = concatenate_videoclips(video_clips, method="compose")
+    if render_lyric:
+        video_clips = render_lyric2video(
+            videoclip=video_clips,
+            lyric=music_map,
+            lyric_info_type="music_map",
+        )
+    video_clips = video_clips.set_audio(audio_clip)
+    write_videoclip(
+        video_clips,
+        path=path,
+        fps=fps,
+        n_thread=n_thread,
+    )
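
The color switching in the demo can be previewed without moviepy. The sketch below only computes which color each clip would get and when it would start, assuming clips are laid end to end as `concatenate_videoclips` does; `color_schedule` is a hypothetical helper and the durations are made up.

```python
def color_schedule(durations, colors=((51, 161, 201), (46, 139, 87))):
    """Return start time, duration, and color for each clip, cycling colors."""
    schedule = []
    time_start = 0.0
    for i, duration in enumerate(durations):
        schedule.append(
            {
                "time_start": time_start,
                "duration": duration,
                # Same indexing as colors[i % len(colors)] in the demo above.
                "color": colors[i % len(colors)],
            }
        )
        time_start += duration
    return schedule


schedule = color_schedule([2.0, 1.5, 3.0])
for entry in schedule:
    print(entry)
```

With two colors, consecutive clips always differ, so every clip boundary in the music map shows up as a visible cut in the rendered demo.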

MuseV/MMCM/mmcm/music/utils/__init__.py ADDED

File without changes

MuseV/MMCM/mmcm/music/utils/path_util.py ADDED

	@@ -0,0 +1,9 @@

+import os
+from typing import Dict, Tuple
+from ...utils.path_util import get_dir_file_map
+def get_audio_path_dct(path, exts=["mp3", "flac", "wav"]) -> Dict[str, str]:
+    """Walk the target folder and its subfolders and build a dict of all audio files."""
+    return get_dir_file_map(path, exts=exts)
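
`get_dir_file_map` is not shown in this part of the diff, so its exact key and value layout is unknown here. As a rough illustration of the kind of traversal the docstring describes, the sketch below walks a folder with `os.walk` and maps each file stem to its full path; that mapping and the name `collect_audio_paths` are assumptions, not the library's behavior.

```python
import os
import tempfile


def collect_audio_paths(root, exts=("mp3", "flac", "wav")):
    """Map file stem -> full path for audio files under root (assumed layout)."""
    wanted = {"." + ext.lower() for ext in exts}
    result = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            stem, ext = os.path.splitext(name)
            if ext.lower() in wanted:
                result[stem] = os.path.join(dirpath, name)
    return result


# Build a throwaway folder with two audio files and one unrelated file.
with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "album"))
    for rel in ("a.mp3", os.path.join("album", "b.wav"), "notes.txt"):
        open(os.path.join(root, rel), "w").close()
    found = collect_audio_paths(root)
    print(sorted(found))
```

Keying by stem means two files with the same name in different subfolders would overwrite each other, so a real implementation may prefer relative paths as keys.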

MuseV/MMCM/mmcm/t2p/.gitignore ADDED

	@@ -0,0 +1,158 @@

+# Byte-compiled / optimized / DLL files
+__pycache__/
+*.py[cod]
+*$py.class
+# C extensions
+*.so
+# Distribution / packaging
+.Python
+build/
+develop-eggs/
+dist/
+downloads/
+eggs/
+.eggs/
+lib/
+lib64/
+parts/
+sdist/
+var/
+wheels/
+pip-wheel-metadata/
+share/python-wheels/
+*.egg-info/
+.installed.cfg
+*.egg
+MANIFEST
+# PyInstaller
+#  Usually these files are written by a python script from a template
+#  before PyInstaller builds the exe, so as to inject date/other infos into it.
+*.manifest
+*.spec
+# Installer logs
+pip-log.txt
+pip-delete-this-directory.txt
+# Unit test / coverage reports
+htmlcov/
+.tox/
+.nox/
+.coverage
+.coverage.*
+.cache
+nosetests.xml
+coverage.xml
+*.cover
+*.py,cover
+.hypothesis/
+.pytest_cache/
+# Translations
+*.mo
+*.pot
+# Django stuff:
+*.log
+local_settings.py
+db.sqlite3
+db.sqlite3-journal
+# Flask stuff:
+instance/
+.webassets-cache
+# Scrapy stuff:
+.scrapy
+# Sphinx documentation
+docs/_build/
+# PyBuilder
+target/
+# Jupyter Notebook
+.ipynb_checkpoints
+# IPython
+profile_default/
+ipython_config.py
+# pyenv
+.python-version
+# pipenv
+#   According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control.
+#   However, in case of collaboration, if having platform-specific dependencies or dependencies
+#   having no cross-platform support, pipenv may install dependencies that don't work, or not
+#   install all needed dependencies.
+#Pipfile.lock
+# PEP 582; used by e.g. github.com/David-OConnor/pyflow
+__pypackages__/
+# Celery stuff
+celerybeat-schedule
+celerybeat.pid
+# SageMath parsed files
+*.sage.py
+# Environments
+.env
+.venv
+env/
+venv/
+ENV/
+env.bak/
+venv.bak/
+# Spyder project settings
+.spyderproject
+.spyproject
+# Rope project settings
+.ropeproject
+# mkdocs documentation
+/site
+# mypy
+.mypy_cache/
+.dmypy.json
+dmypy.json
+# Pyre type checker
+.pyre/
+.vscode
+dataset/dataset_TM_train_cb1_temp.py
+train_gpt_cnn_temp.py
+train_gpt_cnn_mask.py
+start.sh
+start_eval.sh
+config.json
+output_GPT_Final
+output_vqfinal
+output_transformer
+glove
+checkpoints
+dataset/HumanML3D
+dataset/KIT-ML
+output
+matrix_multi.py
+body_models
+render_final_diffuse.py
+render_final_mdm.py
+pretrained
+MDM
+Motiondiffusion
+Visualize_temp.py
+new.sh
+T2M_render
+render_final_t2m.py
+pose

MuseV/MMCM/mmcm/t2p/GPT_eval_multi.py ADDED

	@@ -0,0 +1,121 @@

+import os
+import torch
+import numpy as np
+from torch.utils.tensorboard import SummaryWriter
+import json
+import clip
+import options.option_transformer as option_trans
+import models.vqvae as vqvae
+import utils.utils_model as utils_model
+import utils.eval_trans as eval_trans
+from dataset import dataset_TM_eval
+import models.t2m_trans as trans
+from options.get_eval_option import get_opt
+from models.evaluator_wrapper import EvaluatorModelWrapper
+import warnings
+warnings.filterwarnings('ignore')
+##### ---- Exp dirs ---- #####
+args = option_trans.get_args_parser()
+torch.manual_seed(args.seed)
+args.out_dir = os.path.join(args.out_dir, f'{args.exp_name}')
+os.makedirs(args.out_dir, exist_ok = True)
+##### ---- Logger ---- #####
+logger = utils_model.get_logger(args.out_dir)
+writer = SummaryWriter(args.out_dir)
+logger.info(json.dumps(vars(args), indent=4, sort_keys=True))
+from utils.word_vectorizer import WordVectorizer
+w_vectorizer = WordVectorizer('./glove', 'our_vab')
+val_loader = dataset_TM_eval.DATALoader(args.dataname, True, 32, w_vectorizer)
+dataset_opt_path = 'checkpoints/kit/Comp_v6_KLD005/opt.txt' if args.dataname == 'kit' else 'checkpoints/t2m/Comp_v6_KLD005/opt.txt'
+wrapper_opt = get_opt(dataset_opt_path, torch.device('cuda'))
+eval_wrapper = EvaluatorModelWrapper(wrapper_opt)
+##### ---- Network ---- #####
+## load clip model and datasets
+clip_model, clip_preprocess = clip.load("ViT-B/32", device=torch.device('cuda'), jit=False)  # Must set jit=False for training
+clip.model.convert_weights(clip_model)  # Actually this line is unnecessary since clip by default already on float16
+clip_model.eval()
+for p in clip_model.parameters():
+    p.requires_grad = False
+net = vqvae.HumanVQVAE(args, ## use args to define different parameters in different quantizers
+                       args.nb_code,
+                       args.code_dim,
+                       args.output_emb_width,
+                       args.down_t,
+                       args.stride_t,
+                       args.width,
+                       args.depth,
+                       args.dilation_growth_rate)
+trans_encoder = trans.Text2Motion_Transformer(num_vq=args.nb_code,
+                                embed_dim=args.embed_dim_gpt,
+                                clip_dim=args.clip_dim,
+                                block_size=args.block_size,
+                                num_layers=args.num_layers,
+                                n_head=args.n_head_gpt,
+                                drop_out_rate=args.drop_out_rate,
+                                fc_rate=args.ff_rate)
+print('loading checkpoint from {}'.format(args.resume_pth))
+ckpt = torch.load(args.resume_pth, map_location='cpu')
+net.load_state_dict(ckpt['net'], strict=True)
+net.eval()
+net.cuda()
+if args.resume_trans is not None:
+    print('loading transformer checkpoint from {}'.format(args.resume_trans))
+    ckpt = torch.load(args.resume_trans, map_location='cpu')
+    trans_encoder.load_state_dict(ckpt['trans'], strict=True)
+trans_encoder.eval()  # evaluation only: keep dropout disabled
+trans_encoder.cuda()
+fid = []
+div = []
+top1 = []
+top2 = []
+top3 = []
+matching = []
+multi = []
+repeat_time = 20
+for i in range(repeat_time):
+    (best_fid, best_iter, best_div, best_top1, best_top2, best_top3,
+     best_matching, best_multi, writer, logger) = eval_trans.evaluation_transformer_test(
+        args.out_dir, val_loader, net, trans_encoder, logger, writer, 0,
+        best_fid=1000, best_iter=0, best_div=100,
+        best_top1=0, best_top2=0, best_top3=0,
+        best_matching=100, best_multi=0,
+        clip_model=clip_model, eval_wrapper=eval_wrapper,
+        draw=False, savegif=False, save=False,
+        savenpy=(i == 0))
+    fid.append(best_fid)
+    div.append(best_div)
+    top1.append(best_top1)
+    top2.append(best_top2)
+    top3.append(best_top3)
+    matching.append(best_matching)
+    multi.append(best_multi)
+print('final result:')
+print('fid: ', sum(fid)/repeat_time)
+print('div: ', sum(div)/repeat_time)
+print('top1: ', sum(top1)/repeat_time)
+print('top2: ', sum(top2)/repeat_time)
+print('top3: ', sum(top3)/repeat_time)
+print('matching: ', sum(matching)/repeat_time)
+print('multi: ', sum(multi)/repeat_time)
+fid = np.array(fid)
+div = np.array(div)
+top1 = np.array(top1)
+top2 = np.array(top2)
+top3 = np.array(top3)
+matching = np.array(matching)
+multi = np.array(multi)
+msg_final = (
+    f"FID. {np.mean(fid):.3f}, conf. {np.std(fid)*1.96/np.sqrt(repeat_time):.3f}, "
+    f"Diversity. {np.mean(div):.3f}, conf. {np.std(div)*1.96/np.sqrt(repeat_time):.3f}, "
+    f"TOP1. {np.mean(top1):.3f}, conf. {np.std(top1)*1.96/np.sqrt(repeat_time):.3f}, "
+    f"TOP2. {np.mean(top2):.3f}, conf. {np.std(top2)*1.96/np.sqrt(repeat_time):.3f}, "
+    f"TOP3. {np.mean(top3):.3f}, conf. {np.std(top3)*1.96/np.sqrt(repeat_time):.3f}, "
+    f"Matching. {np.mean(matching):.3f}, conf. {np.std(matching)*1.96/np.sqrt(repeat_time):.3f}, "
+    f"Multi. {np.mean(multi):.3f}, conf. {np.std(multi)*1.96/np.sqrt(repeat_time):.3f}"
+)
+logger.info(msg_final)
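
Note on the final log line of GPT_eval_multi.py: for each metric it reports the mean over the repeated runs plus a 95% confidence half-width, computed as 1.96 * std / sqrt(n) with NumPy's default population standard deviation. The standalone sketch below reproduces only that calculation. The helper name `mean_and_ci95` and the sample values are hypothetical and are not part of this commit.

```python
import numpy as np


def mean_and_ci95(values):
    """Return (mean, 95% confidence half-width) for a list of per-run scores."""
    arr = np.asarray(values, dtype=float)
    # np.std defaults to ddof=0 (population std), as in the script's formula.
    half_width = 1.96 * np.std(arr) / np.sqrt(arr.size)
    return float(np.mean(arr)), float(half_width)


# Made-up per-run FID values, for illustration only.
fid_runs = [0.10, 0.12, 0.11, 0.13]
mean, conf = mean_and_ci95(fid_runs)
print(f"FID. {mean:.3f}, conf. {conf:.3f}")
```

With few repeats, passing `ddof=1` to `np.std` (sample standard deviation) would give a slightly wider interval. The script uses the default, so the sketch does as well.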

MuseV/MMCM/mmcm/t2p/LICENSE ADDED

	@@ -0,0 +1,201 @@

+                                 Apache License
+                           Version 2.0, January 2004
+                        http://www.apache.org/licenses/
+   TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
+   1. Definitions.
+      "License" shall mean the terms and conditions for use, reproduction,
+      and distribution as defined by Sections 1 through 9 of this document.
+      "Licensor" shall mean the copyright owner or entity authorized by
+      the copyright owner that is granting the License.
+      "Legal Entity" shall mean the union of the acting entity and all
+      other entities that control, are controlled by, or are under common
+      control with that entity. For the purposes of this definition,
+      "control" means (i) the power, direct or indirect, to cause the
+      direction or management of such entity, whether by contract or
+      otherwise, or (ii) ownership of fifty percent (50%) or more of the
+      outstanding shares, or (iii) beneficial ownership of such entity.
+      "You" (or "Your") shall mean an individual or Legal Entity
+      exercising permissions granted by this License.
+      "Source" form shall mean the preferred form for making modifications,
+      including but not limited to software source code, documentation
+      source, and configuration files.
+      "Object" form shall mean any form resulting from mechanical
+      transformation or translation of a Source form, including but
+      not limited to compiled object code, generated documentation,
+      and conversions to other media types.
+      "Work" shall mean the work of authorship, whether in Source or
+      Object form, made available under the License, as indicated by a
+      copyright notice that is included in or attached to the work
+      (an example is provided in the Appendix below).
+      "Derivative Works" shall mean any work, whether in Source or Object
+      form, that is based on (or derived from) the Work and for which the
+      editorial revisions, annotations, elaborations, or other modifications
+      represent, as a whole, an original work of authorship. For the purposes
+      of this License, Derivative Works shall not include works that remain
+      separable from, or merely link (or bind by name) to the interfaces of,
+      the Work and Derivative Works thereof.
+      "Contribution" shall mean any work of authorship, including
+      the original version of the Work and any modifications or additions
+      to that Work or Derivative Works thereof, that is intentionally
+      submitted to Licensor for inclusion in the Work by the copyright owner
+      or by an individual or Legal Entity authorized to submit on behalf of
+      the copyright owner. For the purposes of this definition, "submitted"
+      means any form of electronic, verbal, or written communication sent
+      to the Licensor or its representatives, including but not limited to
+      communication on electronic mailing lists, source code control systems,
+      and issue tracking systems that are managed by, or on behalf of, the
+      Licensor for the purpose of discussing and improving the Work, but
+      excluding communication that is conspicuously marked or otherwise
+      designated in writing by the copyright owner as "Not a Contribution."
+      "Contributor" shall mean Licensor and any individual or Legal Entity
+      on behalf of whom a Contribution has been received by Licensor and
+      subsequently incorporated within the Work.
+   2. Grant of Copyright License. Subject to the terms and conditions of
+      this License, each Contributor hereby grants to You a perpetual,
+      worldwide, non-exclusive, no-charge, royalty-free, irrevocable
+      copyright license to reproduce, prepare Derivative Works of,
+      publicly display, publicly perform, sublicense, and distribute the
+      Work and such Derivative Works in Source or Object form.
+   3. Grant of Patent License. Subject to the terms and conditions of
+      this License, each Contributor hereby grants to You a perpetual,
+      worldwide, non-exclusive, no-charge, royalty-free, irrevocable
+      (except as stated in this section) patent license to make, have made,
+      use, offer to sell, sell, import, and otherwise transfer the Work,
+      where such license applies only to those patent claims licensable
+      by such Contributor that are necessarily infringed by their
+      Contribution(s) alone or by combination of their Contribution(s)
+      with the Work to which such Contribution(s) was submitted. If You
+      institute patent litigation against any entity (including a
+      cross-claim or counterclaim in a lawsuit) alleging that the Work
+      or a Contribution incorporated within the Work constitutes direct
+      or contributory patent infringement, then any patent licenses
+      granted to You under this License for that Work shall terminate
+      as of the date such litigation is filed.
+   4. Redistribution. You may reproduce and distribute copies of the
+      Work or Derivative Works thereof in any medium, with or without
+      modifications, and in Source or Object form, provided that You
+      meet the following conditions:
+      (a) You must give any other recipients of the Work or
+          Derivative Works a copy of this License; and
+      (b) You must cause any modified files to carry prominent notices
+          stating that You changed the files; and
+      (c) You must retain, in the Source form of any Derivative Works
+          that You distribute, all copyright, patent, trademark, and
+          attribution notices from the Source form of the Work,
+          excluding those notices that do not pertain to any part of
+          the Derivative Works; and
+      (d) If the Work includes a "NOTICE" text file as part of its
+          distribution, then any Derivative Works that You distribute must
+          include a readable copy of the attribution notices contained
+          within such NOTICE file, excluding those notices that do not
+          pertain to any part of the Derivative Works, in at least one
+          of the following places: within a NOTICE text file distributed
+          as part of the Derivative Works; within the Source form or
+          documentation, if provided along with the Derivative Works; or,
+          within a display generated by the Derivative Works, if and
+          wherever such third-party notices normally appear. The contents
+          of the NOTICE file are for informational purposes only and
+          do not modify the License. You may add Your own attribution
+          notices within Derivative Works that You distribute, alongside
+          or as an addendum to the NOTICE text from the Work, provided
+          that such additional attribution notices cannot be construed
+          as modifying the License.
+      You may add Your own copyright statement to Your modifications and
+      may provide additional or different license terms and conditions
+      for use, reproduction, or distribution of Your modifications, or
+      for any such Derivative Works as a whole, provided Your use,
+      reproduction, and distribution of the Work otherwise complies with
+      the conditions stated in this License.
+   5. Submission of Contributions. Unless You explicitly state otherwise,
+      any Contribution intentionally submitted for inclusion in the Work
+      by You to the Licensor shall be under the terms and conditions of
+      this License, without any additional terms or conditions.
+      Notwithstanding the above, nothing herein shall supersede or modify
+      the terms of any separate license agreement you may have executed
+      with Licensor regarding such Contributions.
+   6. Trademarks. This License does not grant permission to use the trade
+      names, trademarks, service marks, or product names of the Licensor,
+      except as required for reasonable and customary use in describing the
+      origin of the Work and reproducing the content of the NOTICE file.
+   7. Disclaimer of Warranty. Unless required by applicable law or
+      agreed to in writing, Licensor provides the Work (and each
+      Contributor provides its Contributions) on an "AS IS" BASIS,
+      WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
+      implied, including, without limitation, any warranties or conditions
+      of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
+      PARTICULAR PURPOSE. You are solely responsible for determining the
+      appropriateness of using or redistributing the Work and assume any
+      risks associated with Your exercise of permissions under this License.
+   8. Limitation of Liability. In no event and under no legal theory,
+      whether in tort (including negligence), contract, or otherwise,
+      unless required by applicable law (such as deliberate and grossly
+      negligent acts) or agreed to in writing, shall any Contributor be
+      liable to You for damages, including any direct, indirect, special,
+      incidental, or consequential damages of any character arising as a
+      result of this License or out of the use or inability to use the
+      Work (including but not limited to damages for loss of goodwill,
+      work stoppage, computer failure or malfunction, or any and all
+      other commercial damages or losses), even if such Contributor
+      has been advised of the possibility of such damages.
+   9. Accepting Warranty or Additional Liability. While redistributing
+      the Work or Derivative Works thereof, You may choose to offer,
+      and charge a fee for, acceptance of support, warranty, indemnity,
+      or other liability obligations and/or rights consistent with this
+      License. However, in accepting such obligations, You may act only
+      on Your own behalf and on Your sole responsibility, not on behalf
+      of any other Contributor, and only if You agree to indemnify,
+      defend, and hold each Contributor harmless for any liability
+      incurred by, or claims asserted against, such Contributor by reason
+      of your accepting any such warranty or additional liability.
+   END OF TERMS AND CONDITIONS
+   APPENDIX: How to apply the Apache License to your work.
+      To apply the Apache License to your work, attach the following
+      boilerplate notice, with the fields enclosed by brackets "[]"
+      replaced with your own identifying information. (Don't include
+      the brackets!)  The text should be enclosed in the appropriate
+      comment syntax for the file format. We also recommend that a
+      file or class name and description of purpose be included on the
+      same "printed page" as the copyright notice for easier
+      identification within third-party archives.
+   Copyright 2023 tencent
+   Licensed under the Apache License, Version 2.0 (the "License");
+   you may not use this file except in compliance with the License.
+   You may obtain a copy of the License at
+       http://www.apache.org/licenses/LICENSE-2.0
+   Unless required by applicable law or agreed to in writing, software
+   distributed under the License is distributed on an "AS IS" BASIS,
+   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+   See the License for the specific language governing permissions and
+   limitations under the License.