Clarifications on annotation #211

@MikeMpapa

Hi there,
I am working on building a new dataset in Spanish (a polysyllabic language). I have gone through MakeDiffSinger, but I still have some gaps. I would be grateful if you could sanity-check my understanding and share any thoughts you might have.

Questions for clarification:

  1. ph_seq: Is this a sequence of phonemes or of syllables?
    I am currently using phonemes and their timestamps as provided by MFA, with one of the pre-trained Spanish models that MFA makes available. Would you recommend training a new model on my own data instead?

  2. note_dur: Should the MIDI notes be estimated per phoneme, per syllable, or per word?
    For now I estimate one note per phoneme and assume ph_dur == note_dur.

  3. ph_num: Is this the number of phonemes in each word or in each syllable?
    For now I assume it is the number of phonemes in each word.

  4. note_seq: Do you think SOME would suffice for a first pass? My guess is yes.

  5. is_slur: How would you define a slur in this context? I have not found many resources on this topic.
    For now I assume there are no slurs at all.

  6. SPs and APs: Would you recommend annotating these manually, or would the enhance script be good enough for a first pass?
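
To make the assumptions in questions 1, 2, 3 and 5 concrete, here is a minimal Python sketch of how one row would be assembled under them. The alignment values and the `build_row` helper are hypothetical and made up for illustration (they are not real MFA output and not part of MakeDiffSinger); the field names simply follow the ones used in this post. It may well be that these assumptions are wrong, which is what I am asking about.

```python
# Hypothetical word-level alignment, in the shape one might extract from an
# aligner's output: (word, [(phoneme, start_seconds, end_seconds), ...]).
# All values below are invented for illustration.
words = [
    ("hola", [("o", 0.00, 0.25), ("l", 0.25, 0.35), ("a", 0.35, 0.75)]),
    ("sol", [("s", 0.75, 0.90), ("o", 0.90, 1.50), ("l", 1.50, 1.75)]),
]


def build_row(words):
    """Assemble one row under the assumptions stated in the questions above."""
    # Flatten the per-word phoneme lists into a single sequence.
    phones = [p for _, word_phones in words for p in word_phones]

    ph_seq = [name for name, _, _ in phones]
    ph_dur = [round(end - start, 3) for _, start, end in phones]

    # Assumption (question 3): ph_num counts the phonemes in each word.
    ph_num = [len(word_phones) for _, word_phones in words]

    # Assumption (question 2): one note per phoneme, so note_dur mirrors ph_dur.
    note_dur = list(ph_dur)

    # Assumption (question 5): no slurs anywhere.
    is_slur = [0] * len(ph_seq)

    return {
        "ph_seq": ph_seq,
        "ph_dur": ph_dur,
        "ph_num": ph_num,
        "note_dur": note_dur,
        "is_slur": is_slur,
    }


row = build_row(words)
for key, value in row.items():
    print(key, value)
```

Under these assumptions the entries of ph_num always sum to the length of ph_seq, and note_dur has exactly one entry per phoneme.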

Thanks!
