Merge branch 'corpus-processing' into 'main'
Corpus processing, statistics, and title detection improvement

See merge request !3
Changed files:
- .gitignore: 3 additions, 0 deletions
- pdfstruct/collection.py: 24 additions, 3 deletions
- pdfstruct/corpus.py: 141 additions, 0 deletions
- pdfstruct/corpus_processing.py: 248 additions, 40 deletions
- pdfstruct/document.py: 352 additions, 86 deletions
- pdfstruct/line.py: 138 additions, 30 deletions
- pdfstruct/logical_section.py: 45 additions, 29 deletions
- pdfstruct/marker.py: 22 additions, 16 deletions
- pdfstruct/numbering.py: 13 additions, 2 deletions
- pdfstruct/page.py: 226 additions, 52 deletions
- pdfstruct/paragraph.py: 8 additions, 13 deletions
- pdfstruct/patterns.py: 16 additions, 3 deletions
- pdfstruct/structure.py: 11 additions, 5 deletions
- pdfstruct/utils.py: 6 additions, 0 deletions
- play.ipynb: 327 additions, 77 deletions
- poetry.lock: 822 additions, 783 deletions
- pyproject.toml: 3 additions, 1 deletion