docs: update file formats (DEV-1185) #2158

Merged
merged 4 commits into from Aug 15, 2022
18 changes: 9 additions & 9 deletions docs/01-introduction/file-formats.md
@@ -9,15 +9,15 @@ Currently, only a limited number of file formats is accepted to be uploaded onto

 The following table shows the accepted file formats:

-| Category | Accepted format | Converted during ingest? |
-| --------------------- |--------------------------------| -------------------------------------------------------------------------- |
-| Text, XML<sup>1</sup> | TXT, XML, XSL, XSD | No |
-| Tables | CSV, XLS, XLSX | No |
-| 2D Images | JPG, JPEG, JP2, PNG, TIF, TIFF | Yes, converted to JPEG 2000 by [Sipi](https://github.com/dasch-swiss/sipi) |
-| Audio | MPEG (MP3), MP4, WAV | No |
-| Video | MP4 | No |
-| Office | PDF, DOC, DOCX, PPT, PPTX | No |
-| Archives | ZIP, TAR, ISO, GZ, GZIP, 7Z | No |
+| Category | Accepted format | Converted during ingest? |
+| --------------------- |----------------------------------------| -------------------------------------------------------------------------- |
+| Text, XML<sup>1</sup> | TXT, XML, XSL, XSD | No |
+| Tables | CSV, XLS, XLSX | No |
+| 2D Images | JPG, JPEG, JP2, PNG, TIF, TIFF | Yes, converted to JPEG 2000 by [Sipi](https://github.com/dasch-swiss/sipi) |
+| Audio | MPEG (MP3), MP4, WAV | No |
+| Video | MP4 | No |
+| Office | PDF, DOC, DOCX, PPT, PPTX | No |
+| Archives | ZIP, TAR, GZ, Z, TAR.GZ, TGZ, GZIP, 7Z | No |


 1: If your XML files represent text with markup (e.g. [TEI/XML](http://www.tei-c.org/)),
14 changes: 9 additions & 5 deletions sipi/scripts/file_info.lua
@@ -39,9 +39,11 @@ local APPLICATION_PPT = "application/vnd.ms-powerpoint"
 local APPLICATION_PPTX = "application/vnd.openxmlformats-officedocument.presentationml.presentation"
 local APPLICATION_ZIP = "application/zip"
 local APPLICATION_TAR = "application/x-tar"
-local APPLICATION_ISO = "application/x-iso9660-image"
+local APPLICATION_GZ = "application/gzip"
 local APPLICATION_GZIP = "application/gzip"
 local APPLICATION_7Z = "application/x-7z-compressed"
+local APPLICATION_TGZ = "application/x-compress"
+local APPLICATION_Z = "application/x-compress"
 local VIDEO_MP4 = "video/mp4"


@@ -69,7 +71,6 @@ local text_mime_types = {

 local document_mime_types = {
 APPLICATION_PDF,
-APPLICATION_ISO,
 APPLICATION_DOC,
 APPLICATION_DOCX,
 APPLICATION_XLS,
@@ -82,7 +83,9 @@ local archive_mime_types = {
 APPLICATION_TAR,
 APPLICATION_ZIP,
 APPLICATION_GZIP,
-APPLICATION_7Z
+APPLICATION_7Z,
+APPLICATION_TGZ,
+APPLICATION_Z
 }

 local video_mime_types = {
@@ -105,7 +108,6 @@ local text_extensions = {

 local document_extensions = {
 "pdf",
-"iso",
 "doc",
 "docx",
 "xls",
@@ -119,7 +121,9 @@ local archive_extensions = {
 "tar",
 "gz",
 "gzip",
-"7z"
+"7z",
+"tgz",
+"z"
 }

 local video_extensions = {
4 changes: 2 additions & 2 deletions webapi/src/main/resources/application.conf
@@ -323,7 +323,6 @@ app {
 "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet",
 "application/vnd.ms-powerpoint",
 "application/vnd.openxmlformats-officedocument.presentationml.presentation",
-"application/x-iso9660-image",
 ]
 text-mime-types = ["application/xml", "text/xml", "text/csv", "text/plain"]
 video-mime-types = ["video/mp4"]
@@ -332,7 +331,8 @@ app {
 "application/zip",
 "application/x-tar",
 "application/gzip",
-"application/x-7z-compressed"
+"application/x-7z-compressed",
+"application/x-compress"
 ]
 }

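The net effect of the three files on archive uploads can be illustrated with a small standalone Python sketch. It is not part of this PR and not how Sipi or the webapi implement the check. The set names and the `is_accepted_archive` helper are hypothetical, the `zip` extension is assumed from the docs table (it lies outside the hunk shown for `archive_extensions`), and requiring both the extension and the MIME type to match is an assumption made for illustration.

```python
# Hypothetical mirror of the archive lists as they stand after this PR.
# "zip" is assumed from the docs table; the other extensions come from
# the archive_extensions hunk in file_info.lua.
ARCHIVE_EXTENSIONS = {"zip", "tar", "gz", "gzip", "7z", "tgz", "z"}

# MIME types taken from the archive list in application.conf after the change.
ARCHIVE_MIME_TYPES = {
    "application/zip",
    "application/x-tar",
    "application/gzip",
    "application/x-7z-compressed",
    "application/x-compress",
}


def is_accepted_archive(filename: str, mime_type: str) -> bool:
    """Return True when both the file extension and the MIME type are listed.

    Assumption: an upload must pass both checks; the real services may
    combine them differently.
    """
    _, dot, ext = filename.rpartition(".")
    if not dot:
        # No dot in the name, so there is no extension to check.
        return False
    return ext.lower() in ARCHIVE_EXTENSIONS and mime_type in ARCHIVE_MIME_TYPES
```

Under these assumptions, `is_accepted_archive("data.tgz", "application/x-compress")` is accepted, while an ISO image is rejected because neither `iso` nor `application/x-iso9660-image` remains on the lists.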