Hey Sarah,
I had another ingest fail today, on both file format identification and characterization. I wouldn't expect some of these files, like the JPGs and TIFs, to cause any trouble, though there are some files of unknown type mixed in, I guess because they were given to us without extensions.
Examples that failed file identification but passed characterization:
1409_George_Thudichum_copy_.jpg
Exit code: 255
Standard output (stdout)
IDCommand: Identify using Siegfried 1.7.10
IDCommand UUID: 75290b14-2931-455f-bdde-3b4b3f8b7f15
IDTool: Siegfried
IDTool UUID: 454df69d-5cc0-49fc-93e4-6fbb6ac659e7
File: (0c3d34a1-c663-4adc-9fa8-6c566003ad35) /var/archivematica/sharedDirectory/watchedDirectories/workFlowDecisions/selectFormatIDToolTransfer/SABIOM8-8c6edd45-d758-498e-aee2-1a5f81e33c48/objects/Disc_2/1409_George_Thudichum_copy_.jpg
Errors and diagnostics (stderr)
Error: IDCommand with UUID 75290b14-2931-455f-bdde-3b4b3f8b7f15 exited non-zero.
Error: siegfried determined that the file format is UNKNOWN
Desktop_DF
Exit code: 255
Standard output (stdout)
IDCommand: Identify using Siegfried 1.7.10
IDCommand UUID: 75290b14-2931-455f-bdde-3b4b3f8b7f15
IDTool: Siegfried
IDTool UUID: 454df69d-5cc0-49fc-93e4-6fbb6ac659e7
File: (5fee15d8-40d7-492a-ab99-32e9dcee9f6f) /var/archivematica/sharedDirectory/watchedDirectories/workFlowDecisions/selectFormatIDToolTransfer/SABIOM8-8c6edd45-d758-498e-aee2-1a5f81e33c48/objects/Disc_1/Desktop_DF
Errors and diagnostics (stderr)
Error: IDCommand with UUID 75290b14-2931-455f-bdde-3b4b3f8b7f15 exited non-zero.
Error: siegfried determined that the file format is UNKNOWN
1408_Marie_Thudichum.tif
Exit code: 255
Standard output (stdout)
IDCommand: Identify using Siegfried 1.7.10
IDCommand UUID: 75290b14-2931-455f-bdde-3b4b3f8b7f15
IDTool: Siegfried
IDTool UUID: 454df69d-5cc0-49fc-93e4-6fbb6ac659e7
File: (d749544e-834e-470f-87eb-6bacb00f353b) /var/archivematica/sharedDirectory/watchedDirectories/workFlowDecisions/selectFormatIDToolTransfer/SABIOM8-8c6edd45-d758-498e-aee2-1a5f81e33c48/objects/Disc_2/1408_Marie_Thudichum.tif
Errors and diagnostics (stderr)
Error: IDCommand with UUID 75290b14-2931-455f-bdde-3b4b3f8b7f15 exited non-zero.
Error: siegfried determined that the file format is UNKNOWN
And then items in the same ingest that passed file identification but failed characterization:
1408_Marie_Thudichum.jpg
Standard output (stdout)
IDCommand: Identify using Siegfried 1.7.10
IDCommand UUID: 75290b14-2931-455f-bdde-3b4b3f8b7f15
IDTool: Siegfried
IDTool UUID: 454df69d-5cc0-49fc-93e4-6fbb6ac659e7
File: (70665056-56f2-46ac-966a-da3df6d8c748) /var/archivematica/sharedDirectory/watchedDirectories/workFlowDecisions/selectFormatIDToolTransfer/SABIOM8-8c6edd45-d758-498e-aee2-1a5f81e33c48/objects/Disc_2/1408_Marie_Thudichum.jpg
Command output: fmt/41
/var/archivematica/sharedDirectory/watchedDirectories/workFlowDecisions/selectFormatIDToolTransfer/SABIOM8-8c6edd45-d758-498e-aee2-1a5f81e33c48/objects/Disc_2/1408_Marie_Thudichum.jpg identified as a Raw JPEG Stream
Errors and diagnostics (stderr)
get() returned more than one FormatVersion -- it returned 2!
Traceback (most recent call last):
File "/src/MCPClient/lib/job.py", line 111, in JobContext
yield
File "/src/MCPClient/lib/clientScripts/characterize_file.py", line 131, in call
job.set_status(main(job, *job.args[1:]))
File "/src/MCPClient/lib/clientScripts/characterize_file.py", line 46, in main
format = FormatVersion.active.get(fileformatversion__file_uuid=file_uuid)
File "/usr/local/lib/python2.7/dist-packages/django/db/models/manager.py", line 127, in manager_method
return getattr(self.get_queryset(), name)(*args, **kwargs)
File "/usr/local/lib/python2.7/dist-packages/django/db/models/query.py", line 338, in get
(self.model._meta.object_name, num)
MultipleObjectsReturned: get() returned more than one FormatVersion -- it returned 2!
1409_George_Thudichum
Standard output (stdout)
IDCommand: Identify using Siegfried 1.7.10
IDCommand UUID: 75290b14-2931-455f-bdde-3b4b3f8b7f15
IDTool: Siegfried
IDTool UUID: 454df69d-5cc0-49fc-93e4-6fbb6ac659e7
File: (2d6b144d-3add-43a0-b0d1-9c83d0e0f3ce) /var/archivematica/sharedDirectory/watchedDirectories/workFlowDecisions/selectFormatIDToolTransfer/SABIOM8-8c6edd45-d758-498e-aee2-1a5f81e33c48/objects/Disc_1/1409_George_Thudichum
Command output: fmt/353
/var/archivematica/sharedDirectory/watchedDirectories/workFlowDecisions/selectFormatIDToolTransfer/SABIOM8-8c6edd45-d758-498e-aee2-1a5f81e33c48/objects/Disc_1/1409_George_Thudichum identified as a TIFF
Errors and diagnostics (stderr)
get() returned more than one FormatVersion -- it returned 2!
Traceback (most recent call last):
File "/src/MCPClient/lib/job.py", line 111, in JobContext
yield
File "/src/MCPClient/lib/clientScripts/characterize_file.py", line 131, in call
job.set_status(main(job, *job.args[1:]))
File "/src/MCPClient/lib/clientScripts/characterize_file.py", line 46, in main
format = FormatVersion.active.get(fileformatversion__file_uuid=file_uuid)
File "/usr/local/lib/python2.7/dist-packages/django/db/models/manager.py", line 127, in manager_method
return getattr(self.get_queryset(), name)(*args, **kwargs)
File "/usr/local/lib/python2.7/dist-packages/django/db/models/query.py", line 338, in get
(self.model._meta.object_name, num)
MultipleObjectsReturned: get() returned more than one FormatVersion -- it returned 2!
1416__Thudichum.jpg
Standard output (stdout)
IDCommand: Identify using Siegfried 1.7.10
IDCommand UUID: 75290b14-2931-455f-bdde-3b4b3f8b7f15
IDTool: Siegfried
IDTool UUID: 454df69d-5cc0-49fc-93e4-6fbb6ac659e7
File: (d6b5e0a5-7062-43de-8b6f-aa43d0ab45b1) /var/archivematica/sharedDirectory/watchedDirectories/workFlowDecisions/selectFormatIDToolTransfer/SABIOM8-8c6edd45-d758-498e-aee2-1a5f81e33c48/objects/Disc_2/1416__Thudichum.jpg
Command output: fmt/41
/var/archivematica/sharedDirectory/watchedDirectories/workFlowDecisions/selectFormatIDToolTransfer/SABIOM8-8c6edd45-d758-498e-aee2-1a5f81e33c48/objects/Disc_2/1416__Thudichum.jpg identified as a Raw JPEG Stream
Errors and diagnostics (stderr)
get() returned more than one FormatVersion -- it returned 2!
Traceback (most recent call last):
File "/src/MCPClient/lib/job.py", line 111, in JobContext
yield
File "/src/MCPClient/lib/clientScripts/characterize_file.py", line 131, in call
job.set_status(main(job, *job.args[1:]))
File "/src/MCPClient/lib/clientScripts/characterize_file.py", line 46, in main
format = FormatVersion.active.get(fileformatversion__file_uuid=file_uuid)
File "/usr/local/lib/python2.7/dist-packages/django/db/models/manager.py", line 127, in manager_method
return getattr(self.get_queryset(), name)(*args, **kwargs)
File "/usr/local/lib/python2.7/dist-packages/django/db/models/query.py", line 338, in get
(self.model._meta.object_name, num)
MultipleObjectsReturned: get() returned more than one FormatVersion -- it returned 2!
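In case it helps with triage, here is a rough Python sketch for sorting pasted task output like the above into the two failure types. It is not part of Archivematica: the function name `summarize` and the labels it returns are made up for illustration, and it assumes each entry contains a `File: (UUID) path` line followed by its error lines, as in the output I pasted.

```python
import os
import re

# Matches lines such as "File: (<uuid>) /path/to/objects/Disc_2/name.jpg".
FILE_LINE = re.compile(r"^File: \(([0-9a-f-]{36})\) (.+)$")


def summarize(log_text):
    """Return a list of (file name, failure kind) pairs, one per File: line.

    Hypothetical helper; the failure labels are my own wording.
    """
    results = []
    current = None
    for raw_line in log_text.splitlines():
        line = raw_line.strip()
        match = FILE_LINE.match(line)
        if match:
            # Start a new entry; keep only the last path component.
            current = [os.path.basename(match.group(2)), "no error found"]
            results.append(current)
        elif current is not None:
            if "file format is UNKNOWN" in line:
                current[1] = "identification failed"
            elif line.startswith("MultipleObjectsReturned"):
                current[1] = "characterization failed"
    return [tuple(entry) for entry in results]
```

Calling `summarize()` on the saved output text should give one pair per file, which might make it easier to see whether the failures line up with particular extensions or disc folders.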
I hope that makes some sort of sense to you, because it doesn't make any sense to me. I've put through a few other ingests today without these identification or characterization errors, so I'm not sure what the root cause is. It would be easier to track down if it happened every time, to every file, in every ingest, but so far it hasn't.
Best,
Ashley