ASN is only detected in black-and-white scans, not in color

Hello,

So far I am very happy with paperless. Thanks to Stefan I was able to install everything without trouble, including a test instance and backup; it all works fine. I am using the settings from Stefan's config file.

One thing is driving me to despair right now. I have small ASN stickers with a barcode. The ASN is only detected reliably when I scan in black and white. With a 600 dpi color scan, the ASN is never detected. What could be causing this?

I have already tried the alternative engine 'ZXING', with no improvement. In the .yml I have the following settings:

environment:
  PAPERLESS_REDIS: redis://broker:6379
  PAPERLESS_DBHOST: db
  PAPERLESS_CONSUMER_ENABLE_BARCODES: 1
  PAPERLESS_CONSUMER_ENABLE_ASN_BARCODE: true
  PAPERLESS_CONSUMER_BARCODE_SCANNER: 'ZXING' 

Here is an excerpt from the log of the black-and-white scan; ASN 7 was detected:

[2025-01-26 18:05:21,545] [DEBUG] [paperless.tasks] Skipping plugin CollatePlugin
[2025-01-26 18:05:21,546] [DEBUG] [paperless.tasks] Executing plugin BarcodePlugin
[2025-01-26 18:05:21,547] [DEBUG] [paperless.barcodes] Scanning for barcodes using ZXING
[2025-01-26 18:05:21,549] [DEBUG] [paperless.barcodes] PDF has 1 pages
[2025-01-26 18:05:21,549] [DEBUG] [paperless.barcodes] Processing page 0
[2025-01-26 18:05:23,870] [DEBUG] [paperless.barcodes] Image is at /tmp/paperless/tmp_osr2frb/barcodecnniubwb/bbe1aeb9-4f0a-44a8-848a-eaec37e0131c-1.ppm
[2025-01-26 18:05:24,301] [DEBUG] [paperless.barcodes] Barcode of type BarcodeFormat.Code128 found: ASN00007
[2025-01-26 18:05:24,301] [DEBUG] [paperless.barcodes] Barcode of type BarcodeFormat.DataMatrix found: 4174331014274
[2025-01-26 18:05:24,301] [DEBUG] [paperless.barcodes] Barcode of type BarcodeFormat.QRCode found: BCD
001
1
SCT
BIC
KORRESPONDENT UNKENNTLICH
IBAN
EUR

588880013286
[2025-01-26 18:05:24,302] [DEBUG] [paperless.barcodes] Barcode of type BarcodeFormat.DataMatrix found: 4174331014274019028010201000
[2025-01-26 18:05:24,308] [DEBUG] [paperless.barcodes] Found ASN Barcode: ASN00007
[2025-01-26 18:05:24,309] [INFO] [paperless.barcodes] Found ASN in barcode: 7
[2025-01-26 18:05:24,309] [INFO] [paperless.tasks] BarcodePlugin completed with no message
[2025-01-26 18:05:24,310] [DEBUG] [paperless.tasks] Executing plugin WorkflowTriggerPlugin
[2025-01-26 18:05:24,364] [INFO] [paperless.matching] Document matched WorkflowTrigger 3 from Workflow: Pfad mit Jahr
[2025-01-26 18:05:24,454] [INFO] [paperless.tasks] WorkflowTriggerPlugin completed with: Applying WorkflowAction 3 from Workflow: Pfad mit Jahr
[2025-01-26 18:05:24,455] [DEBUG] [paperless.tasks] Executing plugin ConsumeTaskPlugin
[2025-01-26 18:05:24,467] [INFO] [paperless.consumer] Consuming Test ASN SW.pdf
[2025-01-26 18:05:24,470] [DEBUG] [paperless.consumer] Detected mime type: application/pdf
[2025-01-26 18:05:24,480] [DEBUG] [paperless.consumer] Parser: RasterisedDocumentParser
[2025-01-26 18:05:24,485] [DEBUG] [paperless.consumer] Parsing Test ASN SW.pdf…
[2025-01-26 18:05:24,550] [INFO] [paperless.parsing.tesseract] pdftotext exited 0
[2025-01-26 18:05:25,561] [DEBUG] [paperless.parsing.tesseract] Calling OCRmyPDF with args: {'input_file': PosixPath('/tmp/paperless/paperless-ngxac5wjgqt/Test ASN SW.pdf'), 'output_file': PosixPath('/tmp/paperless/paperless-2it0ks5t/archive.pdf'), 'use_threads': True, 'jobs': 4, 'language': 'deu', 'output_type': 'pdfa', 'progress_bar': False, 'color_conversion_strategy': 'RGB', 'skip_text': True, 'clean': True, 'deskew': True, 'rotate_pages': True, 'rotate_pages_threshold': 12.0, 'sidecar': PosixPath('/tmp/paperless/paperless-2it0ks5t/sidecar.txt')}
[2025-01-26 18:05:27,357] [INFO] [ocrmypdf._pipeline] skipping all processing on this page
[2025-01-26 18:05:27,362] [INFO] [ocrmypdf._pipelines.ocr] Postprocessing…
[2025-01-26 18:05:29,963] [WARNING] [ocrmypdf._metadata] Some input metadata could not be copied because it is not permitted in PDF/A. You may wish to examine the output PDF’s XMP metadata.
[2025-01-26 18:05:33,566] [INFO] [ocrmypdf._pipeline] Image optimization ratio: 1.00 savings: 0.0%
[2025-01-26 18:05:33,566] [INFO] [ocrmypdf._pipeline] Total file size ratio: 0.87 savings: -14.5%
[2025-01-26 18:05:33,572] [INFO] [ocrmypdf._pipelines._common] Output file is a PDF/A-2B (as expected)
[2025-01-26 18:05:33,813] [DEBUG] [paperless.parsing.tesseract] Incomplete sidecar file: discarding.
[2025-01-26 18:05:34,005] [INFO] [paperless.parsing.tesseract] pdftotext exited 0
[2025-01-26 18:05:34,007] [DEBUG] [paperless.consumer] Generating thumbnail for Test ASN SW.pdf…
[2025-01-26 18:05:34,011] [DEBUG] [paperless.parsing] Execute: convert -density 300 -scale 500x5000> -alpha remove -strip -auto-orient -define pdf:use-cropbox=true /tmp/paperless/paperless-2it0ks5t/archive.pdf[0] /tmp/paperless/paperless-2it0ks5t/convert.webp
[2025-01-26 18:05:36,455] [INFO] [paperless.parsing] convert exited 0
[2025-01-26 18:05:37,611] [DEBUG] [paperless.consumer] Saving record to database
[2025-01-26 18:05:37,611] [DEBUG] [paperless.consumer] Creation date from parse_date: 2024-10-14 00:00:00+02:00
[2025-01-26 18:05:38,421] [INFO] [paperless.handlers] Assigning document type Rechnung to 2024-10-14 Test ASN SW
[2025-01-26 18:05:38,589] [INFO] [paperless.matching] Document did not match Workflow: Titel automatisch erstellen initial und Tag scan setzen
[2025-01-26 18:05:38,589] [DEBUG] [paperless.matching] ('Document filename Test ASN SW.pdf does not match scan*',)
[2025-01-26 18:05:38,628] [DEBUG] [paperless.consumer] Deleting file /tmp/paperless/paperless-ngxac5wjgqt/Test ASN SW.pdf
[2025-01-26 18:05:39,432] [DEBUG] [paperless.parsing.tesseract] Deleting directory /tmp/paperless/paperless-2it0ks5t
[2025-01-26 18:05:39,434] [INFO] [paperless.consumer] Document 2024-10-14 Test ASN SW consumption finished
[2025-01-26 18:05:39,440] [INFO] [paperless.tasks] ConsumeTaskPlugin completed with: Success. New document id 358 created

Here is an excerpt from the log of the color scan; the ASN was not detected:

[2025-01-26 18:05:40,801] [DEBUG] [paperless.tasks] Skipping plugin CollatePlugin
[2025-01-26 18:05:40,802] [DEBUG] [paperless.tasks] Executing plugin BarcodePlugin
[2025-01-26 18:05:40,802] [DEBUG] [paperless.barcodes] Scanning for barcodes using ZXING
[2025-01-26 18:05:40,805] [DEBUG] [paperless.barcodes] PDF has 1 pages
[2025-01-26 18:05:40,806] [DEBUG] [paperless.barcodes] Processing page 0
[2025-01-26 18:05:41,637] [DEBUG] [paperless.barcodes] Image is at /tmp/paperless/tmps1tiafq5/barcodegr8bt7du/9de71373-8fe9-4a5b-b6a2-7be0c6b20757-1.ppm
[2025-01-26 18:05:42,051] [DEBUG] [paperless.barcodes] Barcode of type BarcodeFormat.DataMatrix found: 4174331014274
[2025-01-26 18:05:42,052] [DEBUG] [paperless.barcodes] Barcode of type BarcodeFormat.QRCode found: BCD
001
1
SCT
BIC
KORRESPONDENT UNKENNTLICH
IBAN
EUR

588880013286
[2025-01-26 18:05:42,052] [DEBUG] [paperless.barcodes] Barcode of type BarcodeFormat.DataMatrix found: 4174331014274019028010201000
[2025-01-26 18:05:42,059] [INFO] [paperless.tasks] BarcodePlugin completed with no message
[2025-01-26 18:05:42,059] [DEBUG] [paperless.tasks] Executing plugin WorkflowTriggerPlugin
[2025-01-26 18:05:42,112] [INFO] [paperless.matching] Document matched WorkflowTrigger 3 from Workflow: Pfad mit Jahr
[2025-01-26 18:05:42,594] [INFO] [paperless.tasks] WorkflowTriggerPlugin completed with: Applying WorkflowAction 3 from Workflow: Pfad mit Jahr
[2025-01-26 18:05:42,595] [DEBUG] [paperless.tasks] Executing plugin ConsumeTaskPlugin
[2025-01-26 18:05:42,608] [INFO] [paperless.consumer] Consuming Test ASN Farbe.pdf
[2025-01-26 18:05:42,612] [DEBUG] [paperless.consumer] Detected mime type: application/pdf
[2025-01-26 18:05:42,622] [DEBUG] [paperless.consumer] Parser: RasterisedDocumentParser
[2025-01-26 18:05:42,627] [DEBUG] [paperless.consumer] Parsing Test ASN Farbe.pdf…
[2025-01-26 18:05:42,693] [INFO] [paperless.parsing.tesseract] pdftotext exited 0
[2025-01-26 18:05:42,864] [DEBUG] [paperless.parsing.tesseract] Calling OCRmyPDF with args: {'input_file': PosixPath('/tmp/paperless/paperless-ngxpptiv5ek/Test ASN Farbe.pdf'), 'output_file': PosixPath('/tmp/paperless/paperless-6bg1q243/archive.pdf'), 'use_threads': True, 'jobs': 4, 'language': 'deu', 'output_type': 'pdfa', 'progress_bar': False, 'color_conversion_strategy': 'RGB', 'skip_text': True, 'clean': True, 'deskew': True, 'rotate_pages': True, 'rotate_pages_threshold': 12.0, 'sidecar': PosixPath('/tmp/paperless/paperless-6bg1q243/sidecar.txt')}
[2025-01-26 18:05:43,265] [INFO] [ocrmypdf._pipeline] skipping all processing on this page
[2025-01-26 18:05:43,273] [INFO] [ocrmypdf._pipelines.ocr] Postprocessing…
[2025-01-26 18:05:44,285] [WARNING] [ocrmypdf._metadata] Some input metadata could not be copied because it is not permitted in PDF/A. You may wish to examine the output PDF’s XMP metadata.
[2025-01-26 18:05:48,627] [INFO] [ocrmypdf._pipeline] Image optimization ratio: 1.23 savings: 18.6%
[2025-01-26 18:05:48,628] [INFO] [ocrmypdf._pipeline] Total file size ratio: 0.95 savings: -4.9%
[2025-01-26 18:05:48,634] [INFO] [ocrmypdf._pipelines._common] Output file is a PDF/A-2B (as expected)
[2025-01-26 18:05:51,274] [DEBUG] [paperless.parsing.tesseract] Incomplete sidecar file: discarding.
[2025-01-26 18:05:51,463] [INFO] [paperless.parsing.tesseract] pdftotext exited 0
[2025-01-26 18:05:51,465] [DEBUG] [paperless.consumer] Generating thumbnail for Test ASN Farbe.pdf…
[2025-01-26 18:05:51,469] [DEBUG] [paperless.parsing] Execute: convert -density 300 -scale 500x5000> -alpha remove -strip -auto-orient -define pdf:use-cropbox=true /tmp/paperless/paperless-6bg1q243/archive.pdf[0] /tmp/paperless/paperless-6bg1q243/convert.webp
[2025-01-26 18:05:54,138] [INFO] [paperless.parsing] convert exited 0
[2025-01-26 18:05:54,904] [DEBUG] [paperless.consumer] Saving record to database
[2025-01-26 18:05:54,904] [DEBUG] [paperless.consumer] Creation date from parse_date: 2024-10-16 00:00:00+02:00
[2025-01-26 18:05:55,719] [INFO] [paperless.handlers] Assigning document type Rechnung to 2024-10-16 Test ASN Farbe
[2025-01-26 18:05:55,870] [INFO] [paperless.matching] Document did not match Workflow: Titel automatisch erstellen initial und Tag scan setzen
[2025-01-26 18:05:55,870] [DEBUG] [paperless.matching] ('Document filename Test ASN Farbe.pdf does not match scan*',)
[2025-01-26 18:05:55,914] [DEBUG] [paperless.consumer] Deleting file /tmp/paperless/paperless-ngxpptiv5ek/Test ASN Farbe.pdf
[2025-01-26 18:05:56,343] [DEBUG] [paperless.parsing.tesseract] Deleting directory /tmp/paperless/paperless-6bg1q243
[2025-01-26 18:05:56,344] [INFO] [paperless.consumer] Document 2024-10-16 Test ASN Farbe consumption finished
[2025-01-26 18:05:56,351] [INFO] [paperless.tasks] ConsumeTaskPlugin completed with: Success. New document id 359 created
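
To make the difference between the two runs easier to see, here is a minimal Python sketch that pulls the "Barcode of type ... found" lines out of a log excerpt and shows which barcodes appear in the black-and-white run but not in the color run. The function name `found_barcodes` and the regular expression are my own assumptions based on the log format in the excerpts above; this is not part of paperless. The sample strings contain only a few of the lines copied from the two logs.

```python
import re

# Matches the debug line that paperless.barcodes writes for each decoded
# barcode, in the format seen in the log excerpts above (assumed stable).
BARCODE_LINE = re.compile(r"Barcode of type BarcodeFormat\.(\w+) found: (.*)")


def found_barcodes(log_text):
    """Return (format, first line of payload) tuples from a log excerpt."""
    matches = (BARCODE_LINE.search(line) for line in log_text.splitlines())
    return [m.groups() for m in matches if m]


# A few lines copied from the black-and-white log
bw_log = """\
[2025-01-26 18:05:24,301] [DEBUG] [paperless.barcodes] Barcode of type BarcodeFormat.Code128 found: ASN00007
[2025-01-26 18:05:24,301] [DEBUG] [paperless.barcodes] Barcode of type BarcodeFormat.DataMatrix found: 4174331014274
"""

# The corresponding line copied from the color log
color_log = """\
[2025-01-26 18:05:42,051] [DEBUG] [paperless.barcodes] Barcode of type BarcodeFormat.DataMatrix found: 4174331014274
"""

# Barcodes decoded in the black-and-white run but not in the color run
missing = set(found_barcodes(bw_log)) - set(found_barcodes(color_log))
print(missing)
```

With these lines, only the Code128 entry (the ASN sticker) ends up in `missing`; the DataMatrix code is found in both runs. So the color scan itself is processed, and only the small Code128 barcode is not decoded.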