About the Role:
We are looking for an experienced Python Developer with expertise in web scraping, automated portal interactions, and cloud-native deployment using AWS. The ideal candidate will have hands-on experience working with Playwright for browser automation, managing multi-factor authentication (MFA) flows, and deploying scalable scraping tasks via AWS Lambda and related services.
You will assist in architecting and building robust, secure, and scalable scraping solutions that interact with complex web applications and secured portals.
Key Responsibilities:
- Assist with the design and implementation of advanced scraping solutions using Python, Playwright, and AWS services.
- Automate interactions with JavaScript-heavy and authentication-secured websites, including handling MFA, CAPTCHAs, and session/token-based login flows.
- Architect scraping pipelines using serverless AWS components such as Lambda, Step Functions, S3, CloudWatch, and Secrets Manager.
- Build systems that scale to support high volumes of data extraction with fault tolerance, automatic retries, and structured logging.
- Integrate and manage complex workflows across multiple portals, APIs, and data sources.
- Contribute to architectural decisions, tooling, and best practices.
Required Skills:
- 5+ years of experience in Python development, with a strong focus on automation and data extraction.
- Proven expertise in web scraping using tools such as Playwright, Selenium, Scrapy, BeautifulSoup, and requests.
- In-depth experience handling multi-step authentication flows, including MFA, CAPTCHA solving, and session/cookie management.
- Proficient in deploying and managing scraping workloads in AWS, particularly Lambda, S3, IAM, CloudWatch, and Secrets Manager.
- Experience with asynchronous programming, headless browsers, and JavaScript-rendered content.
- Solid understanding of web fundamentals (HTTP/HTTPS, cookies, headers) and the ability to reverse-engineer network calls and authentication mechanisms.
- Comfortable working with APIs, JSON/XML, and data transformation.
Preferred Qualifications:
- Experience with CI/CD pipelines, Docker, and infrastructure as code (e.g., CloudFormation, Terraform).
- Familiarity with data pipeline orchestration tools such as Apache Airflow or AWS Step Functions.
- Prior experience scraping from secure or enterprise-level portals (e.g., financial, healthcare, legal).
- Background in data engineering or ETL workflows is a plus.
- Exposure to Python testing frameworks and writing unit/integration tests.
Education:
- Bachelor’s or Master’s degree in Computer Science, Software Engineering, or a related field, or equivalent professional experience.
Find out more about Civicom Pacific at www.civi.com and our Feathers Project at www.feathersproject.org.