
aws-pdf-textract-pipeline

An ETL pipeline that crawls PDFs from the web with Puppeteer, extracts structured data from their contents with AWS Textract, and stores the results in DynamoDB.
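To make the transform and load stages concrete, here is a minimal Python sketch. It is not taken from the project itself. It assumes a Textract text-detection response shaped as a dict with a `Blocks` list, where `LINE` blocks carry a `Text` field. The function names, the attribute names (`source_url`, `line_count`, `lines`), and the table name passed to `store_item` are hypothetical. The Puppeteer crawl step is left out.

```python
from typing import Any


def lines_from_textract(response: dict[str, Any]) -> list[str]:
    """Collect the text of every LINE block in a Textract response."""
    return [
        block["Text"]
        for block in response.get("Blocks", [])
        if block.get("BlockType") == "LINE" and "Text" in block
    ]


def build_item(source_url: str, response: dict[str, Any]) -> dict[str, Any]:
    """Shape one DynamoDB item. Attribute names are hypothetical."""
    lines = lines_from_textract(response)
    return {
        "source_url": source_url,  # assumed partition key
        "line_count": len(lines),
        "lines": lines,
    }


def store_item(table_name: str, item: dict[str, Any]) -> None:
    """Write the item with boto3. Needs AWS credentials and an existing table."""
    import boto3  # imported lazily so the pure functions above run without it

    boto3.resource("dynamodb").Table(table_name).put_item(Item=item)
```

Keeping `build_item` free of AWS calls means the transformation can be checked locally against a recorded response before anything is written to a table.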

166 stars on GitHub