Acceptable file (mime) types for media files

I haven’t found any useful details on this topic in the Arbil-related documents. I have, however, found similar information in the LAMUS manual at http://www.mpi.nl/corpus/manuals/manual-lamus.pdf. If there is more detailed information, please let me know. So, on to my problem:

I have movie files I would like to include in my corpus. The problem is that Arbil seems to accept only the audio .WAV files and none of the videos. According to the LAMUS document, .MPG, .MPEG, and .MP4 should all theoretically be OK. However, I am starting from .MTS and .MOV files (the only options on my two recorders). I have spent far too much time during the last week trying to remux and/or re-encode the files. It seems nothing works…except when I cut out the video and re-encode only the audio into a .WAV format. This is no good, of course. So, I’m looking to see if anyone can give insight into two things:

  1. Are the requirements in the LAMUS guide the same as those for Arbil? (If not, what are the supported file/MIME types?)
  2. Can you provide suggestions about converting .MOV and .MTS files into something that is supported?

I plan to keep the originals, but would like to also have a supported file type for the archive.

Thanks in advance!

I would recommend the Tsunami TMPGEnc 5 encoding application. It will not only convert MTS and MOV files to MPI archive formats, but it also supports a whole range of other video formats. You can buy it here: http://tmpgenc.pegasys-inc.com/en/product/tvmw5.html

The settings for MTS to MPEG-2 are:

Stream format: MPEG-2 Video
Profile & Level: MP@HL
Size: 1920 x 1080
Aspect ratio: Display 16:9
Framerate: 25 fps (for PAL) or 29.97 fps (for NTSC)
Rate control mode: CBR
VBV buffer size: 224 kb
Video system: PAL (if your source is PAL, obviously)
DC component precision: 8 bit
Display mode: Progressive
Motion search precision: Standard
Output stream type: System (Video + Audio)
Bitrate / Quality: 25000

It is very important to set the deinterlace option to “deinterlace always”, because the application defaults to “deinterlace when necessary”.
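For anyone without TMPGEnc, here is a rough sketch of how the settings above might be approximated with ffmpeg instead. This is a different tool from the one recommended above, so treat it as an untested assumption rather than an equivalent recipe. The function name `build_mpeg2_command` and the file names are made up for illustration, the bitrate is assumed to be in kbit/s, and the profile/level, VBV buffer, and strict CBR settings are deliberately not mapped.

```python
def build_mpeg2_command(src, dst, pal=True):
    """Build an ffmpeg argument list approximating the MPEG-2 settings above.

    Assumptions: bitrate 25000 is in kbit/s; profile/level, VBV buffer size
    and strict CBR are not set here and are left to ffmpeg's defaults.
    """
    fps = "25" if pal else "30000/1001"  # 30000/1001 is 29.97 fps (NTSC)
    return [
        "ffmpeg",
        "-i", str(src),
        "-vf", "yadif",          # always deinterlace
        "-c:v", "mpeg2video",    # MPEG-2 video encoder
        "-s", "1920x1080",       # frame size
        "-aspect", "16:9",       # display aspect ratio
        "-r", fps,               # output frame rate
        "-b:v", "25000k",        # target video bitrate (not strict CBR)
        "-f", "vob",             # MPEG-2 program stream (video + audio)
        str(dst),
    ]


if __name__ == "__main__":
    # Print the command so it can be inspected before running it.
    print(" ".join(build_mpeg2_command("clip.mts", "clip.mpeg")))
```

The function only builds the argument list; pass it to `subprocess.run` once the printed command looks right for your footage. Whether Arbil accepts the result is something to check on a short clip first.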

Unfortunately, I cannot get any stream format except MPEG-1 to work. I’ve spent the better part of a day testing a short clip and going through the options. What I’ve found is that MPEG-2, MP4, and MKV are all rejected, no matter what I try. According to the LAMUS manual, the following formats should all be OK:

video/x-mpeg1   video/mpeg   .mpg    mpeg1
video/x-mpeg2   video/mpeg   .mpeg   mpeg2
video/mp4       video/mp4    .mp4    mpeg4, needs hinted track for streaming

But in my experience with Arbil, only the first one is accepted. Any ideas why Arbil will not accept the other two stream formats would be greatly appreciated!
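In case it helps anyone checking their own files, the three rows quoted from the LAMUS manual can be written as a small lookup table. This is only a sketch: the names `LAMUS_VIDEO_TYPES` and `lookup_video_type` are made up, and reading the first two columns as "archive type" and "MIME type" is my assumption about the manual's layout. It checks the file extension only and says nothing about whether the stream inside is one Arbil will accept.

```python
from pathlib import Path

# Rows copied from the LAMUS manual excerpt above.
# extension -> (first column, second column); the column meanings
# ("archive type", "MIME type") are an assumption.
LAMUS_VIDEO_TYPES = {
    ".mpg": ("video/x-mpeg1", "video/mpeg"),
    ".mpeg": ("video/x-mpeg2", "video/mpeg"),
    ".mp4": ("video/mp4", "video/mp4"),
}


def lookup_video_type(filename):
    """Return the table row for a file's extension, or None if it is not listed."""
    ext = Path(filename).suffix.lower()  # compare extensions case-insensitively
    return LAMUS_VIDEO_TYPES.get(ext)


if __name__ == "__main__":
    for name in ["clip.mpg", "clip.MPEG", "clip.mp4", "clip.mts", "clip.mov"]:
        print(name, lookup_video_type(name))
```

Files such as .MTS and .MOV come back as `None`, which matches the fact that they are not in the quoted list.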

Any MPEG2 program (= system) stream that conforms to the standard should be accepted by Arbil, provided that the file extension is .mpeg.

MPEG4 is a bit more complicated. The current stable Arbil only accepts MPEG4 files that were created with QuickTime and contain H.264 video, AAC audio, and a “hinting track”. The testing and pre-testing versions of Arbil should also accept files that are created with ffmpeg (H.264 video and AAC audio; a hinting track is no longer needed).
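As a minimal sketch of the ffmpeg route mentioned above, the following builds a command that re-encodes a source file to H.264 video and AAC audio in an MP4 container. The function name `build_mp4_command` and the file names are hypothetical, it assumes an ffmpeg build that includes the `libx264` encoder, and no quality or bitrate settings are chosen here. It has not been checked against any particular Arbil version.

```python
def build_mp4_command(src, dst):
    """Build an ffmpeg argument list for H.264 video + AAC audio in MP4.

    Assumes ffmpeg was built with libx264; quality settings are left at
    ffmpeg's defaults and would need tuning for archival use.
    """
    return [
        "ffmpeg",
        "-i", str(src),
        "-c:v", "libx264",  # H.264 video
        "-c:a", "aac",      # AAC audio
        str(dst),           # the .mp4 extension selects the MP4 container
    ]


if __name__ == "__main__":
    # Print the command for inspection; run it with subprocess.run if it looks right.
    print(" ".join(build_mp4_command("clip.mov", "clip.mp4")))
```

This does not add a hinting track, so by the description above it would only be relevant for the testing and pre-testing versions.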

Note that these file type restrictions are relevant mostly for people who archive their data in The Language Archive. If you are using Arbil just to organize your own data, you could choose to override the file type restrictions.

Thank you both for the information. I finally had luck with creating the MPEG2 stream. Not entirely sure what I was doing wrong before.

As for MPEG4, I opted to get QuickTime Pro, because that seems to be the most assured way that a “hinting track” can be added, while also converting to MPEG4. This has solved my problem and Arbil is accepting these files.

To answer your note, this is currently an unarchived project. However, I am in the field trip stages of a project to work on the grammar of this language. For that reason, I would like to get the workflow and constraints all dealt with early on.