Data Extraction Engineer
Razer
Location: Chengdu
Posted on Razer website on 28 Feb 2025

Joining Razer will place you on a global mission to revolutionize the way the world games. Razer is a place to do great work, offering you the opportunity to make a global impact while working with a team spread across 5 continents. Razer is also a great place to work, providing the unique, gamer-centric #LifeAtRazer experience that will put you on a path of accelerated growth, both personally and professionally.

Job Responsibilities:

Responsibilities:

  • Design, develop, and deploy web scraping solutions to collect specific datasets for AI training purposes.
  • Build robust and scalable web crawlers to extract structured and unstructured data from various online sources.
  • Ensure data accuracy, integrity, and compliance with relevant laws and regulations.
  • Clean, preprocess, and organize scraped data for use in machine learning models.
  • Monitor and optimize crawling performance to ensure efficiency and reliability.
  • Collaborate with AI teams to define data requirements and ensure the relevance of collected data.
  • Document crawling workflows, tools, and results for future reference.

Requirements:

  • Bachelor's or Master's degree in Computer Science, Software Engineering, or a related field.
  • Strong experience with web scraping tools and frameworks (e.g., Scrapy, Selenium, BeautifulSoup).
  • Proficiency in programming languages like Python, Java, or Node.js.
  • Familiarity with HTTP protocols, HTML parsing, and JSON data formats.
  • Knowledge of database systems (SQL, NoSQL) for data storage and management.
  • Experience with cloud platforms (e.g., AWS, GCP) and containerization tools (e.g., Docker).
  • Strong understanding of web crawling ethics, regulations, and best practices.
  • Excellent analytical skills and attention to detail.

Preferred Qualifications:

  • Experience with large-scale data scraping and handling distributed crawlers.
  • Familiarity with AI and machine learning concepts, especially data preprocessing for AI models.
  • Knowledge of browser automation and tools for rendering dynamic content.
  • Ability to handle multilingual data and diverse data formats.

Pre-Requisites:

Are you game?

Razer Inc. is an American-Singaporean multinational corporation and technology company that makes, develops, and sells consumer electronics, financial services, and gaming hardware.