Data Scientist, AGI Autonomy Human Feedback
San Francisco, California, USA
Amazon.com
Our team’s mission is to build the world’s most useful agent, and we’re looking for a Data Engineer to build the pipelines and tools for collecting and analyzing a wide range of human data. You’ll work alongside world-class AI researchers, engineers, and program managers to identify and implement the best processes for human data collection. This role is highly cross-functional, drawing on skills across data science, machine learning engineering, and project management to ensure our team collects the most effective agentic training data in a rapidly evolving technological environment.
Key job responsibilities
* Work closely with researchers and engineers to create robust data pipelines and data collection tools.
* Work closely with program managers to optimize data collection processes.
* Simplify and enhance the accessibility, clarity, and usability of large or complex datasets through the development of advanced dashboards and applications.
* Take ownership of the design, creation, and upkeep of metrics, reports, analyses, and dashboards to inform data collection projects.
* Develop and manage scalable, automated, and fault-tolerant data solutions using cutting-edge technologies such as Spark, EMR, Python, Redshift, Glue, and S3.
* Continually improve ongoing reporting and analysis processes, automating or simplifying self-service support for datasets.
Basic Qualifications
* 3+ years of data engineering experience
* 1+ years of program or project management experience
* Proficient in SQL
* Experience with data modeling, warehousing and building ETL pipelines
* Experience using data and metrics to determine and drive improvements
* Experience in at least one modern scripting or programming language, such as Python, Java, Scala, or NodeJS
Preferred Qualifications
* Experience with AWS technologies like Redshift, S3, AWS Glue, EMR, Kinesis, Firehose, Lambda, and IAM roles and permissions
* Experience with non-relational databases / data stores (object storage, document or key-value stores, graph databases, column-family databases)
* Experience with big data technologies such as: Hadoop, Hive, Spark, EMR
Amazon is committed to a diverse and inclusive workplace. Amazon is an equal opportunity employer and does not discriminate on the basis of race, national origin, gender, gender identity, sexual orientation, protected veteran status, disability, age, or other legally protected status.
Los Angeles County applicants: Job duties for this position include: work safely and cooperatively with other employees, supervisors, and staff; adhere to standards of excellence despite stressful conditions; communicate effectively and respectfully with employees, supervisors, and staff to ensure exceptional customer service; and follow all federal, state, and local laws and Company policies. Criminal history may have a direct, adverse, and negative relationship with some of the material job duties of this position. These include the duties and responsibilities listed above, as well as the abilities to adhere to company policies, exercise sound judgment, effectively manage stress and work safely and respectfully with others, exhibit trustworthiness and professionalism, and safeguard business operations and the Company’s reputation. Pursuant to the Los Angeles County Fair Chance Ordinance, we will consider for employment qualified applicants with arrest and conviction records.
Pursuant to the San Francisco Fair Chance Ordinance, we will consider for employment qualified applicants with arrest and conviction records.
Our inclusive culture empowers Amazonians to deliver the best results for our customers. If you have a disability and need a workplace accommodation or adjustment during the application and hiring process, including support for the interview or onboarding process, please visit https://amazon.jobs/content/en/how-we-hire/accommodations for more information. If the country/region you’re applying in isn’t listed, please contact your Recruiting Partner.
Category: Data Science Jobs
Tags: AGI, AWS, AWS Glue, Big Data, Data pipelines, Engineering, ETL, Firehose, Hadoop, Java, Kinesis, Lambda, Machine Learning, Node.js, Pipelines, Python, RDBMS, Redshift, Scala, Spark, SQL
Perks/benefits: Career development
Region: North America
Country: United States