Remote
Posted 6 months ago

Remote opportunity with a Large Financial Client

  • Must have designed the end-to-end architecture of a unified data platform covering all aspects of the data lifecycle: ingestion, transformation, serving, and consumption.
  • Must have excellent coding skills in either Python or Scala, preferably Python.
  • Must have 10+ years of experience in the Data Engineering domain, with 12+ years of total experience.
  • Must have designed and implemented at least 2-3 projects end-to-end in Databricks.
  • Must have 3+ years of experience on Databricks, covering the components below:
    • Delta lake
    • Databricks Connect (dbConnect)
    • Databricks API 2.0
    • SQL Endpoint – Photon engine
    • Unity Catalog
    • Databricks workflows orchestration
    • Security management
    • Platform governance
    • Data Security
  • Must have applied various architectural principles to design the solution best suited to each problem.
  • Must be well versed with Databricks Lakehouse concept and its implementation in enterprise environments.
  • Must have strong understanding of Data warehousing and various governance and security standards around Databricks.
  • Must have knowledge of cluster optimization and its integration with various cloud services.
  • Must have a good understanding of building complex data pipelines.
  • Must be strong in SQL and Spark SQL.
  • Must have strong performance optimization skills to improve efficiency and reduce cost.
  • Must have designed both batch and streaming data pipelines.
  • Must have extensive knowledge of the Spark and Hive data processing frameworks.
  • Must have worked on a cloud platform (Azure, AWS, GCP) and its common services, such as ADLS/S3, ADF/Lambda, Cosmos DB/DynamoDB, ASB/SQS, and cloud databases.
  • Must be strong in writing unit tests and integration tests.
  • Must have strong communication skills and have worked with cross-platform teams.
  • Must have a great attitude toward learning new skills and upskilling existing ones.
  • Responsible for setting best practices around Databricks CI/CD.
  • Must understand composable architecture to take full advantage of Databricks capabilities.
  • Good to have REST API knowledge.
  • Good to have an understanding of cost distribution.
  • Good to have experience on a migration project building a unified data platform.
  • Good to have knowledge of dbt.
  • Experience around DevSecOps including docker and Kubernetes.
  • Software development full-lifecycle methodologies, patterns, frameworks, libraries, and tools
  • Knowledge of programming and scripting languages such as JavaScript, PowerShell, Bash, SQL, Java, Python, etc.
  • Experience with data ingestion technologies such as Azure Data Factory, SSIS, Pentaho, Alteryx
  • Experience with visualization tools such as Tableau, Power BI
  • Experience with machine learning tools such as MLflow, Databricks AI/ML, Azure ML, AWS SageMaker, etc.
  • Experience in distilling complex technical challenges to actionable decisions for stakeholders and guiding project teams by building consensus and mediating compromises when necessary.  
  • Experience coordinating the intersection of complex system dependencies and interactions  
  • Experience in solution delivery using common methodologies especially SAFe Agile but also Waterfall, Iterative, etc.  
  • Demonstrated knowledge of relevant industry trends and standards
