bytec
19 October 2025 at 19:05
1
Hello everyone,
I would like to configure Paperless so that every uploaded PDF, no matter what kind, gets a fresh OCR pass with OCRmyPDF.
That should really be covered by PAPERLESS_OCR_MODE=redo or force. But I have been working on this for half the day now. No matter what I set in paperless.conf or in the web UI, the log always shows: [DEBUG] [paperless.parsing.tesseract] Document has text, skipping OCRmyPDF entirely.
Paperless is installed in an LXC and otherwise runs without problems. The paperless.conf is being read: I tested this by entering a wrong DB password there, after which Paperless stopped running. So the conf file is read correctly.
I am really at my wit's end. Maybe you have an idea.
Regards
Please post your complete config file.
bytec
21 October 2025 at 17:43
3
# Have a look at the docs for documentation.
# https://docs.paperless-ngx.com/configuration/
# Debug. Only enable this for development.
#PAPERLESS_DEBUG=false
# Required services
PAPERLESS_REDIS=redis://localhost:6379
PAPERLESS_DBHOST=localhost
PAPERLESS_DBPORT=5432
PAPERLESS_DBNAME=paperlessdb
PAPERLESS_DBUSER=paperless
PAPERLESS_DBPASS=xxxxxxx
#PAPERLESS_DBSSLMODE=prefer
# Paths and folders
PAPERLESS_CONSUMPTION_DIR=/opt/paperless/consume
PAPERLESS_DATA_DIR=/opt/paperless/data
#PAPERLESS_EMPTY_TRASH_DIR=
PAPERLESS_MEDIA_ROOT=/opt/paperless/media
PAPERLESS_STATICDIR=/opt/paperless/static
#PAPERLESS_FILENAME_FORMAT=
#PAPERLESS_FILENAME_FORMAT_REMOVE_NONE=
# Security and hosting
PAPERLESS_SECRET_KEY=xxxx
PAPERLESS_URL=https://xxxx
#PAPERLESS_CSRF_TRUSTED_ORIGINS=https://example.com # can be set using PAPERLESS_URL
#PAPERLESS_ALLOWED_HOSTS=example.com,www.example.com # can be set using PAPERLESS_URL
#PAPERLESS_CORS_ALLOWED_HOSTS=https://localhost:8080,https://example.com # can be set using PAPERLESS_URL
#PAPERLESS_FORCE_SCRIPT_NAME=
#PAPERLESS_STATIC_URL=/static/
#PAPERLESS_AUTO_LOGIN_USERNAME=
#PAPERLESS_COOKIE_PREFIX=
#PAPERLESS_ENABLE_HTTP_REMOTE_USER=false
# OCR settings
PAPERLESS_OCR_LANGUAGE=deu
#PAPERLESS_OCR_MODE=redo
PAPERLESS_OCR_SKIP_ARCHIVE_FILE=always
#PAPERLESS_OCR_OUTPUT_TYPE=pdfa
#PAPERLESS_OCR_PAGES=1
#PAPERLESS_OCR_IMAGE_DPI=300
PAPERLESS_OCR_CLEAN=clean
PAPERLESS_OCR_DESKEW=true
#PAPERLESS_OCR_ROTATE_PAGES=true
#PAPERLESS_OCR_ROTATE_PAGES_THRESHOLD=12.0
#PAPERLESS_OCR_USER_ARGS={}
#PAPERLESS_CONVERT_MEMORY_LIMIT=0
#PAPERLESS_CONVERT_TMPDIR=/var/tmp/paperless
PAPERLESS_OCR_MAX_IMAGE_PIXELS=5000000000
# Software tweaks
PAPERLESS_TASK_WORKERS=8
#PAPERLESS_THREADS_PER_WORKER=8
#PAPERLESS_TIME_ZONE=UTC
#PAPERLESS_CONSUMER_POLLING=10
#PAPERLESS_CONSUMER_DELETE_DUPLICATES=false
#PAPERLESS_CONSUMER_RECURSIVE=false
#PAPERLESS_CONSUMER_IGNORE_PATTERNS=[".DS_STORE/*", "._*", ".stfolder/*", ".stversions/*", ".localized/*", "desktop.ini"]
#PAPERLESS_CONSUMER_SUBDIRS_AS_TAGS=false
#PAPERLESS_CONSUMER_ENABLE_BARCODES=false
#PAPERLESS_CONSUMER_BARCODE_STRING=PATCHT
#PAPERLESS_CONSUMER_BARCODE_UPSCALE=0.0
#PAPERLESS_CONSUMER_BARCODE_DPI=300
#PAPERLESS_CONSUMER_ENABLE_TAG_BARCODE=false
#PAPERLESS_CONSUMER_TAG_BARCODE_MAPPING={"TAG:(.*)": "\\g<1>"}
#PAPERLESS_CONSUMER_ENABLE_COLLATE_DOUBLE_SIDED=false
#PAPERLESS_CONSUMER_COLLATE_DOUBLE_SIDED_SUBDIR_NAME=double-sided
#PAPERLESS_CONSUMER_COLLATE_DOUBLE_SIDED_TIFF_SUPPORT=false
#PAPERLESS_PRE_CONSUME_SCRIPT=/path/to/an/arbitrary/script.sh
#PAPERLESS_POST_CONSUME_SCRIPT=/path/to/an/arbitrary/script.sh
#PAPERLESS_FILENAME_DATE_ORDER=YMD
#PAPERLESS_FILENAME_PARSE_TRANSFORMS=[]
#PAPERLESS_NUMBER_OF_SUGGESTED_DATES=5
#PAPERLESS_THUMBNAIL_FONT_NAME=
#PAPERLESS_IGNORE_DATES=
#PAPERLESS_ENABLE_UPDATE_CHECK=
PAPERLESS_CONSUMER_INOTIFY_DELAY=1
# Tika settings
#PAPERLESS_TIKA_ENABLED=false
#PAPERLESS_TIKA_ENDPOINT=http://localhost:9998
#PAPERLESS_TIKA_GOTENBERG_ENDPOINT=http://localhost:3000
# Binaries
#PAPERLESS_CONVERT_BINARY=/usr/bin/convert
#PAPERLESS_GS_BINARY=/usr/bin/gs
PAPERLESS_EMAIL_HOST=xxxx
PAPERLESS_EMAIL_PORT=465
PAPERLESS_EMAIL_HOST_USER=xxxx
PAPERLESS_EMAIL_HOST_PASSWORD=xxx
PAPERLESS_EMAIL_FROM=xxxx
PAPERLESS_EMAIL_USE_TLS=false
PAPERLESS_EMAIL_USE_SSL=true
#PAPERLESS_CONSUMER_ENABLE_BARCODES=true # enable search for barcodes
#PAPERLESS_CONSUMER_ENABLE_ASN_BARCODE=true # enable setting ASN by ASN barcodes
#PAPERLESS_CONSUMER_BARCODE_SCANNER=ZXING # switch from pyzbar to zxing for better recognition
#PAPERLESS_CONSUMER_BARCODE_UPSCALE=1.5
#PAPERLESS_CONSUMER_BARCODE_DPI=600
#PAPERLESS_WORKER_TIMEOUT=10000
bytec:
#PAPERLESS_OCR_MODE=redo
Is the # in front of the relevant setting PAPERLESS_OCR_MODE only there for debugging, or has it been there the whole time, making that configuration line inactive?
bytec
22 October 2025 at 04:45
5
No, that was only for debugging. I also entered REDO in the web UI at one point, but that did not change anything either. It is a real mystery to me.
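
To make the question about the leading # easier to check, here is a minimal sketch that splits a paperless.conf into active and commented-out settings. The names `parse_conf` and `SAMPLE` are made up for illustration and are not part of Paperless; the sample text is a short excerpt of the OCR section posted above. Applied to the posted file, it would list PAPERLESS_OCR_MODE as commented out and PAPERLESS_OCR_SKIP_ARCHIVE_FILE=always as active. Whether the latter is connected to the "skipping OCRmyPDF entirely" message is only an assumption here and should be checked against the Paperless-ngx configuration docs.

```python
def parse_conf(text: str) -> tuple[dict[str, str], dict[str, str]]:
    """Split PAPERLESS_* KEY=VALUE lines into active and commented-out settings.

    Values are kept as written (trailing inline comments are not stripped).
    """
    active: dict[str, str] = {}
    disabled: dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        target = active
        if line.startswith("#"):
            # A leading '#' makes the whole line a comment, so the setting is inactive.
            line = line.lstrip("#").strip()
            target = disabled
        key, sep, value = line.partition("=")
        key = key.strip()
        if not sep or not key.startswith("PAPERLESS_"):
            continue  # plain comment, heading, or blank line
        target[key] = value.strip()
    return active, disabled


# Excerpt of the OCR section from the config posted in this thread.
SAMPLE = """\
# OCR settings
PAPERLESS_OCR_LANGUAGE=deu
#PAPERLESS_OCR_MODE=redo
PAPERLESS_OCR_SKIP_ARCHIVE_FILE=always
"""

if __name__ == "__main__":
    active, disabled = parse_conf(SAMPLE)
    for key in sorted(k for k in active if "_OCR_" in k):
        print(f"active:   {key}={active[key]}")
    for key in sorted(k for k in disabled if "_OCR_" in k):
        print(f"disabled: {key}={disabled[key]}")
```

To run it against a real installation, replace `SAMPLE` with the contents of your own paperless.conf, for example read via `pathlib.Path(...).read_text()`. This only shows what the file says; settings entered in the web UI are stored elsewhere and are not covered by this check.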