Job Description:
Perform web data mining and big data extraction from a variety of online sources.
Clean, transform, and validate data for use in analytics and machine learning applications.
Automate data pipelines and workflows using Python, PySpark, and tools like Apache Airflow.
Design and manage data warehouses, data lakes, and cloud-based storage solutions.
Job Responsibilities:
Data Infrastructure: Build automated pipelines and cloud solutions (e.g., AWS, GCP,…).
Data Integration and Management: Develop data warehouses and data lakes for optimal data storage and retrieval.
LLM Data Pipeline: Develop pipelines for Large Language Models (LLMs), including RAG, LangChain, or LangGraph.
Big Data Mining: Extract and mine large-scale datasets from major e-commerce platforms in Vietnam, China, Korea, Southeast Asia,…
Data Processing: Clean and transform raw data into structured formats suitable for analytics and machine learning.
Data Visualization: Create visualizations and reports to communicate insights effectively.
Qualifications & Skills:
Technical Skills:
Proficiency in Python and data processing libraries (e.g., PySpark, Pandas).
Experience with data mining tools and techniques (e.g., BeautifulSoup, Scrapy, Selenium).
Understanding of data architecture concepts (data warehouses, data lakes, and cloud platforms).
Familiarity with data pipeline tools (e.g., Apache Airflow) and cloud management.
Familiarity with SQL for database management.
Basic understanding of HTML, CSS, JavaScript, and web structures.
Knowledge of LLM frameworks and tools (e.g., LangChain, LangGraph, RAG).
Education: Currently studying Computer Science, Data Science, Information Technology, or a related field.
Soft Skills:
Strong problem-solving skills, attention to detail, and a passion for data engineering.
Good communication skills in both Vietnamese and English.
Additional Information:
Duration: 3-month internship.
Eligibility: Open to 3rd-year or final-year students.
Career Opportunity: Potential for full-time employment after the internship.
Allowance: 3-4 million VND per month.
About us:
We, ABC Studio (Ai Bigdata Content Studio), are an innovative Korea-Vietnam AI and big data company specializing in generative AI and big data engineering, especially for the market intelligence and vision graphics industries.
Our visions are to be:
the leading innovative AI content engineering company in film (VFX), webtoon, and social marketing
the world's best market data company, with global e-commerce and SNS big data
From Michael Kwak, CEO of ABC Studio
At the moment, we have partnerships with Korean companies in the webtoon, film, fashion, cosmetics, food, and digital marketing industries.
We look forward to welcoming enthusiastic and talented interns who are willing to join us on our long and meaningful journey.