<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0" xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd" xmlns:googleplay="http://www.google.com/schemas/play-podcasts/1.0"><channel><title><![CDATA[The Long Game in Data Engineering]]></title><description><![CDATA[Upskilling data professionals for cloud data engineering roles, interviews, and long-term career growth.]]></description><link>https://sachinchandrashekhar.substack.com</link><image><url>https://substackcdn.com/image/fetch/$s_!NLVh!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa146a109-6850-46f9-8a96-514d4d7023c1_236x236.png</url><title>The Long Game in Data Engineering</title><link>https://sachinchandrashekhar.substack.com</link></image><generator>Substack</generator><lastBuildDate>Sat, 09 May 2026 09:37:30 GMT</lastBuildDate><atom:link href="https://sachinchandrashekhar.substack.com/feed" rel="self" type="application/rss+xml"/><copyright><![CDATA[Sachin Chandrashekhar]]></copyright><language><![CDATA[en]]></language><webMaster><![CDATA[sachinchandrashekhar@substack.com]]></webMaster><itunes:owner><itunes:email><![CDATA[sachinchandrashekhar@substack.com]]></itunes:email><itunes:name><![CDATA[Sachin Chandrashekhar]]></itunes:name></itunes:owner><itunes:author><![CDATA[Sachin Chandrashekhar]]></itunes:author><googleplay:owner><![CDATA[sachinchandrashekhar@substack.com]]></googleplay:owner><googleplay:email><![CDATA[sachinchandrashekhar@substack.com]]></googleplay:email><googleplay:author><![CDATA[Sachin Chandrashekhar]]></googleplay:author><itunes:block><![CDATA[Yes]]></itunes:block><item><title><![CDATA[From No Experience to Startup Opportunity: How One International Student Used Projects, GRA, and Wellfound to Create His Own Breakthrough]]></title><description><![CDATA[Sometimes 
the breakthrough does not begin with a perfect job.]]></description><link>https://sachinchandrashekhar.substack.com/p/from-no-experience-to-startup-opportunity</link><guid isPermaLink="false">https://sachinchandrashekhar.substack.com/p/from-no-experience-to-startup-opportunity</guid><dc:creator><![CDATA[Sachin Chandrashekhar]]></dc:creator><pubDate>Fri, 08 May 2026 23:20:39 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!NLVh!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa146a109-6850-46f9-8a96-514d4d7023c1_236x236.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Sometimes&#8230;</p><p>It begins with clarity.</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://sachinchandrashekhar.substack.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading The Long Game in Data Engineering! Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><p>Recently, I spoke to one of my students, Nischal.</p><p>He came to the U.S. 
as a master&#8217;s student.</p><p>No major corporate experience.<br>No deep understanding of Data Engineering initially.<br>No perfectly mapped career blueprint.</p><p>Like many students&#8230;</p><p>He simply wanted an opportunity.</p><p>But what stood out was not luck.</p><p>It was how he gradually positioned himself.</p><p>And his story carries an important lesson for many aspiring Data Engineers, especially international students trying to break into the U.S. market.</p><div><hr></div><h1>The Reality Many Students Face</h1><p>For many international students, the challenge is rarely just &#8220;learning technology.&#8221;</p><p>The bigger challenge is:</p><p><strong>How do I build enough credibility for someone to take a chance on me?</strong></p><p>Because often:</p><ul><li><p>You may not have prior U.S. work experience</p></li><li><p>You may not yet know how the corporate ecosystem works</p></li><li><p>You may be competing against experienced professionals</p></li><li><p>Visa sponsorship may heavily influence career decisions</p></li></ul><p>This is where many people feel stuck.</p><p>They keep learning&#8230;</p><p>But struggle to convert learning into opportunity.</p><div><hr></div><h1>What Changed for Nischal?</h1><p>Nischal&#8217;s transformation did not happen because he randomly applied to hundreds of jobs.</p><p>It also did not happen because he consumed random tutorials without structure.</p><p>A major turning point in his journey was gaining roadmap clarity through structured learning.</p><p>Before that&#8230;</p><p>Like many aspiring professionals&#8230;</p><p>He knew technology terms existed.</p><p>Lambda.<br>AWS.<br>Cloud.</p><p>But he did not yet fully understand:</p><ul><li><p>Which service solves what problem</p></li><li><p>When to use specific tools</p></li><li><p>How real-world architecture decisions are made</p></li><li><p>How Data Engineering actually functions as an industry role</p></li></ul><p>This is an important 
distinction.</p><p>Because many learners today are exposed to information&#8230;</p><p>But exposure is not the same as clarity.</p><p>Through the course, one of the biggest advantages he repeatedly highlighted was not just &#8220;content.&#8221;</p><p>It was:</p><h2>Structured understanding.</h2><p>He specifically spoke about how the program helped him:</p><ul><li><p>Understand AWS from a Data Engineering lens</p></li><li><p>Connect services to practical use cases</p></li><li><p>Think more like industry</p></li><li><p>Build confidence in discussing architecture</p></li><li><p>Strengthen fundamentals before deeper specialization</p></li></ul><p>In simple words:</p><p>The course helped convert scattered awareness into usable direction.</p><p>And once that direction became clearer&#8230;</p><p>He was able to build <strong>evidence</strong>.</p><h2>1. He Built Real Projects</h2><p>This was one of the biggest turning points.</p><p>He built:</p><ul><li><p>AWS-based cloud projects</p></li><li><p>GRA (Graduate Research Assistantship) projects</p></li><li><p>Databricks projects</p></li><li><p>Real demos with GitHub-backed implementation</p></li></ul><p>During interviews&#8230;</p><p>They did not just ask him generic questions.</p><p>They specifically asked about his projects.</p><p>He was able to explain:</p><ul><li><p>Why he chose certain AWS services</p></li><li><p>How data flowed through his architecture</p></li><li><p>How his solutions worked end-to-end</p></li><li><p>What decisions he made</p></li></ul><p>That matters.</p><p>Because projects move you from:</p><p><strong>&#8220;I learned this&#8221;</strong></p><p>To:</p><p><strong>&#8220;I built this.&#8221;</strong></p><p>And that difference is massive.</p><div><hr></div><h1>The Hidden Advantage of GRA Roles</h1><p>One major insight from his journey:</p><p>His Graduate Research Assistantship became more than just campus work.</p><p>It became resume credibility.</p><p>He worked on:</p><ul><li><p>Real AWS 
architecture</p></li><li><p>Lambda</p></li><li><p>S3</p></li><li><p>ETL-like systems</p></li><li><p>Dashboarding</p></li><li><p>Data workflows</p></li></ul><p>This gave him something many students lack:</p><h2>Experience that can be explained.</h2><p>Not theoretical awareness.</p><p>Practical, discussable implementation.</p><p>For students&#8230;</p><p>This is powerful.</p><p>Sometimes your first big opportunity may not come directly from a dream company.</p><p>It may come from:</p><ul><li><p>University labs</p></li><li><p>Research roles</p></li><li><p>Startups</p></li><li><p>Smaller but meaningful builds</p></li></ul><p>And these can become stepping stones.</p><div><hr></div><h1>The Wellfound Insight: A Startup Marketplace More People Should Know About</h1><p>One of the most practical parts of Nischal&#8217;s story was how he discovered his startup opportunity.</p><p>He used <strong>Wellfound</strong> (formerly AngelList Talent).</p><h2>What is Wellfound?</h2><p>Wellfound is essentially a startup-focused job marketplace where:</p><ul><li><p>Startups post opportunities</p></li><li><p>Job seekers create detailed profiles</p></li><li><p>Founders and hiring teams evaluate fit faster</p></li><li><p>Applications can feel more direct than traditional platforms</p></li></ul><p>Unlike the black-hole feeling many experience on large job boards&#8230;</p><p>Nischal found that Wellfound gave quicker visibility into:</p><ul><li><p>Whether he was moving forward</p></li><li><p>Whether he was rejected</p></li><li><p>Which startups aligned with his profile</p></li></ul><h2>Why this matters:</h2><p>Startups often care deeply about:</p><ul><li><p>Execution ability</p></li><li><p>Ownership mindset</p></li><li><p>Adaptability</p></li><li><p>Problem-solving</p></li></ul><p>Sometimes&#8230;</p><p>A strong project portfolio can stand out more here than polished corporate history.</p><p>That does not mean it is easier.</p><p>But it can mean:</p><p><strong>Different pathways 
exist.</strong></p><div><hr></div><h1>A Powerful Career Lesson: Sometimes the &#8220;Perfect&#8221; Role Is Not the First Door</h1><p>Interestingly&#8230;</p><p>Nischal initially interviewed for something closer to product QA.</p><p>But during the process&#8230;</p><p>He communicated his deeper interest in Data Engineering and specific workflows.</p><p>That matters.</p><p>Because many professionals underestimate this:</p><h2>Positioning matters.</h2><p>Sometimes opportunities evolve when you demonstrate:</p><ul><li><p>Technical interest</p></li><li><p>Project relevance</p></li><li><p>Curiosity</p></li><li><p>Initiative</p></li></ul><p>He did not passively accept a label.</p><p>He communicated where he could create value.</p><div><hr></div><h1>The Visa Sponsorship Reality</h1><p>This was another deeply practical part of our conversation.</p><p>He received:</p><h2>Option A:</h2><p>Startup opportunity with equity now + potential visa sponsorship after funding</p><h2>Option B:</h2><p>Paid AI internship, but sponsorship uncertain</p><p>This is where career strategy becomes personal.</p><p>For international students&#8230;</p><p>Short-term salary is important.</p><p>But immigration stability may sometimes matter more.</p><p>This does not mean everyone should choose unpaid roles.</p><p>But it does highlight:</p><h2>Career decisions are not always about immediate income alone.</h2><p>Sometimes:</p><ul><li><p>Sponsorship pathway</p></li><li><p>Domain alignment</p></li><li><p>Long-term positioning</p></li><li><p>Industry entry point</p></li></ul><p>&#8230;may carry larger strategic weight.</p><div><hr></div><h1>What His Story Reinforced</h1><p>If I had to simplify his journey:</p><h2>Fundamentals:</h2><ul><li><p>SQL</p></li><li><p>Python</p></li><li><p>Cloud (AWS)</p></li></ul><h2>Execution:</h2><ul><li><p>Projects</p></li><li><p>GRA</p></li><li><p>Databricks</p></li><li><p>GitHub</p></li></ul><h2>Visibility:</h2><ul><li><p>Wellfound</p></li><li><p>Intentional profile 
building</p></li><li><p>Applying strategically</p></li></ul><h2>Outcome:</h2><ul><li><p>Startup role</p></li><li><p>Equity pathway</p></li><li><p>Potential sponsorship</p></li><li><p>Industry credibility</p></li></ul><div><hr></div><h1>A Bigger Lesson About Learning: Information Alone Is Not Enough</h1><p>One of the strongest takeaways from this conversation was something many learners need to hear:</p><h2>Random information is everywhere.</h2><p>But roadmap clarity is rare.</p><p>Nischal openly reflected that before structured guidance, he did not fully understand how the ecosystem connected.</p><p>This is where many people lose time.</p><p>They may learn:</p><ul><li><p>A little AWS</p></li><li><p>A little Python</p></li><li><p>A little SQL</p></li><li><p>A few tutorials</p></li></ul><p>&#8230;but still struggle to answer:</p><h2>&#8220;How does this all fit together in the real world?&#8221;</h2><p>This is why structured, mentor-led, roadmap-driven learning can accelerate growth.</p><p>Not because information is hidden.</p><p>But because sequence matters.</p><p>When someone helps you understand:</p><ul><li><p>Fundamentals first</p></li><li><p>Why architecture choices matter</p></li><li><p>How projects build credibility</p></li><li><p>How to speak industry language</p></li></ul><p>&#8230;your confidence compounds differently.</p><p>In many ways&#8230;</p><p>That was one of the hidden advantages in his journey.</p><p>Not just learning more.</p><p>Learning in the right order.</p><div><hr></div><h1>My Bigger Reflection for Aspiring Data Engineers</h1><p>We live in a time where many people are overwhelmed by:</p><ul><li><p>AI</p></li><li><p>Agentic systems</p></li><li><p>Cloud</p></li><li><p>Spark</p></li><li><p>Databricks</p></li><li><p>Streaming</p></li></ul><p>And yes&#8230; these matter.</p><p>But often the real question is simpler:</p><h2>&#8220;Can you build enough proof that someone trusts you?&#8221;</h2><p>Because careers are often built in this 
sequence:</p><p><strong>Learn &#8594; Build &#8594; Explain &#8594; Position &#8594; Opportunity</strong></p><p>Not:</p><p><strong>Learn endlessly &#8594; Hope</strong></p><div><hr></div><h1>Final Thought</h1><p>Nischal&#8217;s journey is not just about securing an opportunity.</p><p>It is about something deeper:</p><h2>Clarity creates momentum.</h2><p>He did not begin with perfect knowledge.</p><p>He began by:</p><ul><li><p>Learning fundamentals</p></li><li><p>Building projects</p></li><li><p>Gaining practical exposure</p></li><li><p>Using platforms like Wellfound strategically</p></li><li><p>Making long-term decisions</p></li></ul><p>For many of you&#8230;</p><p>Especially students, career switchers, or those feeling behind&#8230;</p><p>This matters.</p><p>You may not need the perfect start.</p><p>But you do need:</p><h2>Direction + Execution + Visibility</h2><p>And sometimes&#8230;</p><p>That combination can create opportunities you may not have imagined when you first began.</p><div><hr></div><h2>If you are currently trying to break into Data Engineering:</h2><p>Do not just ask:</p><p><strong>&#8220;What should I learn next?&#8221;</strong></p><p>Also ask:</p><p><strong>&#8220;What can I build that proves I belong?&#8221;</strong></p><p>That question can change everything.</p><div><hr></div><h1>A Final Note From Me</h1><p>Nischal&#8217;s journey did not happen overnight.</p><p>He played the long game.</p><p>With:</p><ul><li><p>Grit</p></li><li><p>Patience</p></li><li><p>Structured learning</p></li><li><p>Real projects</p></li><li><p>Strategic positioning</p></li></ul><p>And over time&#8230;</p><p>That compounded.</p><p>If you are serious about building toward opportunities like this&#8230;</p><p>Not just learning randomly&#8230;</p><p>But understanding roadmap, projects, cloud, and real-world Data Engineering from a deeper lens&#8230;</p><h2>Join my webinar:</h2><p><a href="https://aws.sachin.cloud">https://aws.sachin.cloud</a></p><p>Because 
sometimes&#8230;</p><p>The right direction, combined with consistent execution, can change far more than you currently realize.</p>]]></content:encoded></item><item><title><![CDATA[Depth Before Breadth: Is Data Engineering Alone Still Enough Anymore?]]></title><description><![CDATA[A question came in recently from a learner that I believe many professionals are quietly thinking about right now:]]></description><link>https://sachinchandrashekhar.substack.com/p/depth-before-breadth-is-data-engineering</link><guid isPermaLink="false">https://sachinchandrashekhar.substack.com/p/depth-before-breadth-is-data-engineering</guid><dc:creator><![CDATA[Sachin Chandrashekhar]]></dc:creator><pubDate>Tue, 05 May 2026 09:45:16 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!NLVh!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa146a109-6850-46f9-8a96-514d4d7023c1_236x236.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>A question came in recently from a learner that I believe many professionals are quietly thinking about right now:</p><p><strong>&#8220;Is strong Data Engineering 
alone still enough&#8230; or are we moving toward a world where engineers need to also understand AI, APIs, architecture, orchestration, and broader technical ecosystems?&#8221;</strong></p><p>It is a powerful question.</p><p>Because if you look around today, the industry can feel noisy.</p><p>AI is everywhere.<br>Agents are everywhere.<br>Full-stack expectations are growing.<br>Companies are experimenting fast.</p><p>And for many professionals, this creates an uncomfortable fear:</p><p><strong>&#8220;Am I falling behind if I focus deeply on Data Engineering?&#8221;</strong></p><p>My honest answer?</p><p><strong>Not necessarily.<br>But how you approach your growth matters more than ever.</strong></p><div><hr></div><h1>The Real Problem: Many Professionals Are Confusing Expansion With Progress</h1><p>One of the biggest mistakes I see today is that people often assume career growth means learning <em>everything</em> at once.</p><p>So they jump from:</p><p>Data Engineering &#8594; AI &#8594; Full Stack &#8594; Frontend &#8594; APIs &#8594; Agentic AI &#8594; Architecture &#8594; DevOps</p><p>&#8230;and before long, they are overwhelmed.</p><p>This creates a dangerous 
pattern:</p><p><strong>Breadth without depth.</strong></p><p>They know &#8220;about&#8221; many things&#8230;</p><p>&#8230;but are not truly strong enough in one domain to create credibility.</p><p>And in real-world projects, credibility matters.</p><div><hr></div><h1>What I Am Actually Seeing in the Industry</h1><p>From what I observe:</p><p>Yes &#8212; companies are increasingly asking Data Engineers to participate in:</p><ul><li><p>AI POCs</p></li><li><p>Data foundations for LLM systems</p></li><li><p>API integrations</p></li><li><p>Cross-functional collaboration</p></li><li><p>Automation discussions</p></li><li><p>Tool ecosystem decisions</p></li></ul><p>But this does <strong>not</strong> automatically mean every Data Engineer must immediately become:</p><ul><li><p>AI Engineer</p></li><li><p>Architect</p></li><li><p>Full Stack Developer</p></li><li><p>ML Engineer</p></li><li><p>Agent Orchestrator</p></li></ul><p>The market is still evolving.</p><p>Many of these roles are still being shaped.</p><p>Which means this is not the time for panic.</p><p>This is the time for strategic positioning.</p><div><hr></div><h1>My Personal Approach: Depth First</h1><p>I made a conscious decision myself.</p><p>Even after exploring AI Engineering to prepare for industry conversations&#8230;</p><p><strong>I chose to go deeper into Data Engineering first.</strong></p><p>Why?</p><p>Because Data Engineering itself is already a massive domain:</p><ul><li><p>Cloud</p></li><li><p>SQL</p></li><li><p>Python</p></li><li><p>Spark</p></li><li><p>Distributed systems</p></li><li><p>Databricks</p></li><li><p>Snowflake</p></li><li><p>Architecture</p></li><li><p>Scalability</p></li><li><p>Lakehouse patterns</p></li><li><p>Performance</p></li></ul><p>And AI?</p><p>That is another massive domain on its own.</p><p>Trying to deeply master both too early can dilute focus.</p><p>So I would rather build:</p><h2>Depth first &#8594; Then breadth</h2><p>Instead of:</p><h2>Shallow exposure to everything 
&#8594; Mastery in nothing</h2><div><hr></div><h1>Why Depth Still Matters More Than Most People Realize</h1><p>Depth gives you something critical:</p><h2>Judgment.</h2><p>This matters even more in the AI era.</p><p>Because yes&#8230;</p><p>AI can absolutely help you write code faster.<br>AI can help you build faster.<br>AI can help you explore adjacent technologies faster.</p><p>But AI cannot replace your ability to ask:</p><p><strong>&#8220;Is this architecture scalable?&#8221;</strong><br><strong>&#8220;Is this pipeline reliable?&#8221;</strong><br><strong>&#8220;Is this design production-worthy?&#8221;</strong><br><strong>&#8220;Is this actually the right solution?&#8221;</strong></p><p>Without foundational depth&#8230;</p><p>You risk accepting AI output at face value.</p><p>And that can become dangerous.</p><p>So the future is likely not:</p><p><strong>AI replaces engineers</strong></p><p>It is more likely:</p><p><strong>Engineers with depth + AI fluency outperform engineers without either</strong></p><div><hr></div><h1>A More Strategic Career Framework</h1><p>Here is the path I currently believe makes the most sense for many professionals:</p><h2>Near Term:</h2><h2>Build undeniable depth</h2><p>Focus on:</p><ul><li><p>SQL</p></li><li><p>Python</p></li><li><p>Cloud</p></li><li><p>Spark</p></li><li><p>Real-world pipelines</p></li><li><p>Architecture thinking</p></li><li><p>Scalability</p></li><li><p>Databricks / Snowflake ecosystem understanding</p></li></ul><div><hr></div><h2>Mid Term:</h2><h2>Add strategic breadth</h2><p>Expand into:</p><ul><li><p>AI fluency</p></li><li><p>Prompting</p></li><li><p>AI validation</p></li><li><p>APIs</p></li><li><p>Automation</p></li><li><p>Agent orchestration</p></li><li><p>Broader system integration</p></li></ul><div><hr></div><h2>Long Term:</h2><h2>Move toward technical leadership</h2><p>This may include:</p><ul><li><p>Architecture ownership</p></li><li><p>Platform decisions</p></li><li><p>Cross-functional 
oversight</p></li><li><p>Team leadership</p></li><li><p>AI strategy</p></li></ul><div><hr></div><h1>The New Career Advantage</h1><p>I believe the strongest professionals going forward will not necessarily be the people who chase every trend first&#8230;</p><p>They will likely be the people who combine:</p><h2>Strong foundational depth</h2><h2>Clear thinking</h2><h2>Adaptability</h2><h2>Strategic expansion</h2><div><hr></div><h1>So&#8230; Is Data Engineering Alone Enough?</h1><p>Here is my honest take:</p><h2>Yes &#8212; if you are building real depth.</h2><p>But&#8230;</p><h2>Pure tool-only Data Engineering without broader awareness may eventually become limiting.</h2><p>In simple terms:</p><p><strong>Depth gives you credibility</strong><br><strong>Breadth gives you adaptability</strong></p><p>And right now?</p><p>For most people&#8230;</p><p><strong>Depth should probably come first.</strong></p><div><hr></div><h1>Final Thought</h1><p>You do not need to become everything overnight.</p><p>You do not need to panic every time the industry shifts.</p><p>You do need to ask:</p><p><strong>&#8220;What core capability, if mastered deeply, will give me leverage&#8230; and make future expansion easier?&#8221;</strong></p><p>For many professionals today&#8230;</p><p>That answer may still very well be:</p><h2>Data Engineering</h2><p>Not as the final destination&#8230;</p><p>&#8230;but as the foundation.</p><p>And foundations matter.</p><div><hr></div><p><strong>What do you think?</strong><br>Are you currently prioritizing depth&#8230; or trying to balance breadth too early?</p><p></p><h1>Want Clarity On What To Learn First?</h1><p>If you are trying to figure out:</p><p><strong>Should you focus on SQL first? AWS first? Databricks? Snowflake? 
AI?</strong></p><p>&#8230;and want a practical roadmap instead of random overwhelm&#8230;</p><p>I cover this in my live masterclass:</p><h2>Register here: <a href="https://aws.sachin.cloud">https://aws.sachin.cloud</a></h2><p>I break down:</p><ul><li><p>What to prioritize first</p></li><li><p>How to avoid learning backwards</p></li><li><p>How to build real-world depth</p></li><li><p>How to position yourself for high-paying Cloud &amp; AI-powered Data Engineering roles</p></li></ul><p>Because sometimes&#8230;</p><p>The biggest career advantage is not learning faster.</p><p>It is learning in the right order.</p>]]></content:encoded></item><item><title><![CDATA[Why AWS Glue, EMR Serverless, and EMR Are Not the Same Thing (And Why It Matters Right Now)]]></title><description><![CDATA[A company I know let go of several engineers last week.]]></description><link>https://sachinchandrashekhar.substack.com/p/why-aws-glue-emr-serverless-and-emr</link><guid isPermaLink="false">https://sachinchandrashekhar.substack.com/p/why-aws-glue-emr-serverless-and-emr</guid><dc:creator><![CDATA[Sachin Chandrashekhar]]></dc:creator><pubDate>Sun, 03 May 2026 17:36:14 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!NLVh!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa146a109-6850-46f9-8a96-514d4d7023c1_236x236.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>A company I know let go of several engineers last week. Not just junior folks. A senior leader too.</p><p>This is not a scare post. But it is a reality check.</p><p>The engineers who are safest tend to be the ones who keep adding skills consistently, one layer at a time. The ones who understand not just <em>what</em> a tool is, but <em>when</em> to use it, <em>why</em> companies choose it, and <em>what it costs</em>.</p><p>Today I want to break down one of the most misunderstood areas in AWS data engineering: the difference between <strong>AWS Glue</strong>, <strong>EMR Serverless</strong>, and <strong>EMR on EC2</strong>. I covered this in a live session recently, and the questions from the room told me this needs a proper written explanation.</p><p>Let&#8217;s start from scratch.</p><div><hr></div><h2>What Are We Actually Trying to Do?</h2><p>Data engineering is simple at its core:</p><ul><li><p>You have data sitting somewhere (a <strong>source</strong>)</p></li><li><p>You want to <strong>process</strong> it: clean it, transform it, aggregate it</p></li><li><p>You load the result somewhere (a <strong>target</strong>)</p></li></ul><p>In AWS, your source and target are often <strong>S3</strong>. Think of it as AWS&#8217;s version of Google Drive: file storage in the cloud. The interesting question is what sits in the middle: the <strong>processing layer</strong>.</p><p>For small data, say 1 or 2 GB, Python on a single machine is fine. You read the file, do your aggregations, write the output. Done.</p><p>But what happens when your data is in the terabytes? Spread across multiple files? Growing every day?</p><p>A single machine can&#8217;t handle it. 
You need multiple machines working together, splitting the data, processing in parallel, and combining results into one output. That&#8217;s where <strong>Apache Spark</strong> comes in: it&#8217;s the framework built exactly for this. And PySpark is Spark&#8217;s Python API: you write Python, and Spark does the distributed work under the hood.</p><p>Now, the question becomes: <em>where do you run Spark on AWS?</em></p><p>You have three main options.</p><div><hr></div><h2>Option 1: AWS Glue (Spark)</h2><p>Glue is a fully managed ETL service. When you use it with Spark, here&#8217;s what you do:</p><ul><li><p>Write your PySpark code</p></li><li><p>Choose a <strong>worker type</strong> (e.g., G.1X = 4 vCPUs, 16 GB RAM)</p></li><li><p>Choose a <strong>number of workers</strong> (e.g., 4 workers)</p></li><li><p>Run the job</p></li></ul><p>AWS handles everything else. It provisions the machines, installs Spark, runs your code, and tears it all down. You don&#8217;t touch a single server.</p><p>One of those workers is always the <strong>driver</strong>, which coordinates the work. The rest are <strong>executors</strong>, which do the actual processing in parallel.</p><p>If you choose 4 workers with G.1X, one is the driver, so you&#8217;re getting 3 executors &#215; 4 vCPUs = 12 vCPUs and 3 &#215; 16 GB = 48 GB RAM in total working on your data simultaneously.</p><p>You can also enable <strong>auto-scaling</strong> in Glue. The worker count you set then acts as a ceiling, and Glue adds or removes workers as the job needs them rather than spinning up all 4 every time.</p><p><strong>The upside:</strong> Simple, fast to get started, fully abstracted. You&#8217;re not managing infrastructure at all.</p><p><strong>The downside:</strong> It&#8217;s usually the most expensive of the three options. You&#8217;re paying a premium for that simplicity.</p><div><hr></div><h2>Option 2: EMR on EC2</h2><p>EMR stands for Elastic MapReduce. 
This is where things get more powerful, and more complex.</p><p>With EMR on EC2, you are creating an actual <strong>cluster</strong> of Linux machines (EC2 instances) on AWS. You choose:</p><ul><li><p>The type of EC2 instance (how many cores, how much RAM)</p></li><li><p>Primary nodes, core nodes, task nodes</p></li><li><p>The number of each</p></li></ul><p>You have far more flexibility than Glue. You can tune every aspect of your infrastructure to match your exact workload.</p><p><strong>But here&#8217;s the catch:</strong> when you turn that cluster on, it stays on until you terminate it. Whether your job is running or not, those EC2 machines are running, and you are paying for them. You&#8217;re essentially <em>renting</em> Linux machines for as long as the cluster is up.</p><p>The analogy I use: <strong>EMR on EC2 is like renting a car for 3 days. You pay for the car whether you drive it or not.</strong></p><p>To avoid unnecessary costs, companies often build automation around it: scripts that spin up the cluster when a job needs to run, and tear it down when it&#8217;s done. That adds maintenance overhead.</p><p><strong>Who uses EMR on EC2?</strong> Teams that already have deep big data expertise: people who&#8217;ve been running on-premises Hadoop or Spark clusters for years and know exactly what they need. When they move to AWS, they replicate that setup in the cloud. It can be 40&#8211;70% cheaper than Glue for the same workload, depending on instance choice and how well the cluster is utilized, but it&#8217;s only worth it if you have the expertise to manage it.</p><div><hr></div><h2>Option 3: EMR Serverless</h2><p>EMR Serverless sits between Glue and EMR on EC2. It gives you much of the cost efficiency of EMR without the cluster management overhead.</p><p>Here&#8217;s the key difference: instead of choosing a worker type and count, you configure <strong>executors and drivers directly</strong>, and set a maximum CPU and memory limit for the application.</p><pre><code><code>Pre-initialized capacity:
  Driver:    1 &#215; (2 vCPU, 4 GB)
  Executors: 3 &#215; (4 vCPU, 16 GB)

Maximum capacity:
  CPU:    40 vCPU
  Memory: 160 GB
</code></code></pre><p>When your job runs, EMR Serverless automatically scales up to whatever resources it needs &#8212; up to your defined maximum. Behind the scenes, AWS is spinning up machines and running executors on them. But you never see those machines, you never configure them, and you don&#8217;t pay for them when nothing is running.</p><p><strong>The Uber analogy:</strong> Glue and EMR Serverless are like calling an Uber. AWS brings the car from its fleet, you pay for the ride, and when you&#8217;re done it&#8217;s gone. EMR on EC2 is like renting a car &#8212; you have it parked in your driveway all day whether you use it or not.</p><p><strong>Who uses EMR Serverless?</strong> Companies that started with Glue, found it getting expensive as their data grew, and wanted a more cost-efficient path without the full complexity of managing EC2 clusters.</p><div><hr></div><h2>How Companies Actually Evolve</h2><p>This is the pattern I see in the real world:</p><ol><li><p><strong>Start with Glue</strong> &#8212; Fast to set up, easy to iterate, perfect when you&#8217;re building out your first pipelines</p></li><li><p><strong>Move to EMR Serverless</strong> &#8212; As data volume grows and the Glue bill becomes noticeable</p></li><li><p><strong>Move to EMR on EC2</strong> &#8212; When the team has deep Spark expertise and wants maximum control and cost efficiency</p></li></ol><p>Most companies hiring AWS data engineers today are somewhere in steps 1 or 2. Knowing all three &#8212; and being able to reason about the tradeoffs &#8212; is what separates a strong candidate from an average one.</p><div><hr></div><h2>One More Thing: Glue Bookmark</h2><p>While we&#8217;re here &#8212; one of the most common interview questions about Glue is about <strong>Glue Bookmark</strong>.</p><p>When you&#8217;re processing files that land in S3 daily, you don&#8217;t want to reprocess yesterday&#8217;s files. 
Glue Bookmark tracks which files have already been processed, so when your job runs tomorrow, it only picks up the new ones. It&#8217;s Glue&#8217;s built-in mechanism for incremental data processing. Simple concept, but knowing it exists and how to enable it puts you ahead in interviews.</p><div><hr></div><h2>What I Heard From Two Students This Week</h2><p>After the hands-on session where we ran both Glue and EMR Serverless jobs live, two students from the community &#8212; Shilpa and Hema &#8212; shared something that stayed with me.</p><p>Shilpa said:</p><blockquote><p>&#8220;Whoever is new to these technologies &#8212; the skill booster courses are enough to get you started. It&#8217;s not worth spending your first month just learning Python. And the hands-on labs are addictive &#8212; make sure you have enough time when you start, because you won&#8217;t want to stop midway.&#8221;</p></blockquote><p>Hema added:</p><blockquote><p>&#8220;What Sachin explained today about workers and executors &#8212; I could relate to it immediately because I&#8217;d just covered it in the performance tuning section two days ago. It all connected.&#8221;</p></blockquote><p>That&#8217;s the thing about learning this way &#8212; the theory lands differently once you&#8217;ve actually run the job yourself and seen the job fail because you were in the wrong AWS region, or because your account wasn&#8217;t upgraded. That friction is the learning.</p><div><hr></div><h2>Want to See This Live?</h2><p>If you want to go from zero to running real AWS data engineering pipelines &#8212; with Glue, EMR Serverless, PySpark, and production-grade project work &#8212; I run a live webinar where I walk through the RADE program (Real-world AWS Data Engineering) end to end.</p><p>In our Live Classes, it is NOT slides-only theory. We go into the console. We run jobs. 
We break things and fix them.</p><p><strong>[Register for the next free webinar &#8594;] <a href="https://aws.sachin.cloud">https://aws.sachin.cloud</a></strong></p><p>If you&#8217;re an IT professional working in legacy tools like Informatica, SSIS, or DataStage and you know it&#8217;s time to modernize &#8212; this is where to start.</p><div><hr></div><p><em>Questions or thoughts on Glue vs EMR? Drop them in the comments. I read every one.</em></p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://sachinchandrashekhar.substack.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading The Long Game in Data Engineering! Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div>]]></content:encoded></item><item><title><![CDATA[From 3 Years Experience to Multiple Offers — What Actually Made the Difference]]></title><description><![CDATA[I had a session with one of my students, Bhumika who 10Xed her salary.]]></description><link>https://sachinchandrashekhar.substack.com/p/from-3-years-experience-to-multiple</link><guid isPermaLink="false">https://sachinchandrashekhar.substack.com/p/from-3-years-experience-to-multiple</guid><dc:creator><![CDATA[Sachin Chandrashekhar]]></dc:creator><pubDate>Sat, 02 May 2026 01:10:23 GMT</pubDate><enclosure 
url="https://substackcdn.com/image/fetch/$s_!NLVh!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa146a109-6850-46f9-8a96-514d4d7023c1_236x236.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>I had a session with one of my students, Bhumika, who 10Xed her salary.</p><p>She shared her journey &#8212; openly &#8212; with the entire community.</p><p>Not just the outcome.</p><p>But the process behind it.</p><div><hr></div><h3>The Outcome (Context, Not the Story)</h3><p>She had:</p><ul><li><p>~3&#8211;3.5 years of experience</p></li><li><p>A data engineering background</p></li><li><p>Exposure to AWS and pipelines</p></li></ul><p>And in her recent job search:</p><ul><li><p>Got calls from 5+ product-based companies</p></li><li><p>Cleared multiple interviews</p></li><li><p>Converted 3 offers</p></li></ul><p>That&#8217;s the visible part.</p><p>But that&#8217;s not the interesting part.</p><div><hr></div><h3>What Most People Don&#8217;t See</h3><p>When people hear this, they assume:</p><ul><li><p>&#8220;She must be very talented&#8221;</p></li><li><p>&#8220;She must have been naturally 
good&#8221;</p></li><li><p>&#8220;She got lucky&#8221;</p></li></ul><p>But when you listen carefully, a very different picture emerges.</p><div><hr></div><h3>1. This Was Planned &#8212; Not Random</h3><p>She didn&#8217;t start applying casually.</p><p>She worked backwards.</p><ul><li><p>Set a <strong>4&#8211;6 month timeline</strong></p></li><li><p>Built a structured plan</p></li><li><p>Aligned her daily schedule</p></li></ul><blockquote><p>2&#8211;4 hours of focused effort. Every single day.</p></blockquote><p>No bursts. No shortcuts.</p><p>Just consistency.</p><div><hr></div><h3>2. Preparation Was Layered</h3><p>She didn&#8217;t treat preparation as:</p><blockquote><p>&#8220;Learn everything once&#8221;</p></blockquote><p>She broke it down into layers:</p><ul><li><p>Coding (Python, SQL)</p></li><li><p>Data engineering concepts</p></li><li><p>AWS architecture</p></li><li><p>Project storytelling</p></li></ul><p>And importantly:</p><blockquote><p>She practiced each of these separately</p></blockquote><div><hr></div><h3>3. She Didn&#8217;t Wait to Feel Ready</h3><p>One thing she said stood out:</p><blockquote><p>&#8220;We never feel that we are fully prepared.&#8221;</p></blockquote><p>And because of that:</p><ul><li><p>She didn&#8217;t wait</p></li><li><p>She started giving interviews early</p></li></ul><p>Even for companies she didn&#8217;t intend to join.</p><div><hr></div><h3>4. Interviews Became the Learning Loop</h3><p>This is where most people get it wrong.</p><p>They treat interviews as:</p><blockquote><p>A final test</p></blockquote><p>She treated them as:</p><blockquote><p>A feedback system</p></blockquote><ul><li><p>Gave interviews</p></li><li><p>Noted every question she couldn&#8217;t answer</p></li><li><p>Filled those gaps</p></li><li><p>Repeated the cycle</p></li></ul><blockquote><p>&#8220;Each interview was a stepping stone.&#8221;</p></blockquote><div><hr></div><h3>5. 
Storytelling Was a Differentiator</h3><p>She didn&#8217;t just prepare answers.</p><p>She prepared:</p><blockquote><p>How to explain her work</p></blockquote><ul><li><p>Projects</p></li><li><p>Use cases</p></li><li><p>Trade-offs</p></li><li><p>Real-world decisions</p></li></ul><p>Because at a certain level:</p><blockquote><p>Clarity of explanation matters as much as knowledge</p></blockquote><div><hr></div><h3>6. She Optimized for the Market</h3><p>A few practical things she did:</p><ul><li><p>Tailored her resume to each job description</p></li><li><p>Ensured strong keyword match (ATS)</p></li><li><p>Used multiple channels:</p><ul><li><p>Job portals</p></li><li><p>Direct applications</p></li><li><p>Referrals</p></li></ul></li><li><p>Stayed active on LinkedIn</p></li></ul><p>None of this is &#8220;secret.&#8221;</p><p>But very few people do all of it consistently.</p><div><hr></div><h3>7. She Understood Timing</h3><p>She mentioned something important:</p><ul><li><p>Good companies take time</p></li><li><p>Hiring processes stretch over weeks</p></li><li><p>Calls don&#8217;t come immediately</p></li></ul><p>So she:</p><blockquote><p>Started early and stayed patient</p></blockquote><div><hr></div><h3>The Real Takeaway</h3><p>This was not about:</p><ul><li><p>One course</p></li><li><p>One resource</p></li><li><p>One lucky interview</p></li></ul><p>This was about:</p><blockquote><p><strong>Structured effort, over time</strong></p></blockquote><p>That&#8217;s what compounds.</p><div><hr></div><h3>The Part Most People Skip</h3><p>Everyone is willing to:</p><ul><li><p>Watch videos</p></li><li><p>Take notes</p></li><li><p>&#8220;Learn&#8221;</p></li></ul><p>Very few are willing to:</p><ul><li><p>Practice consistently</p></li><li><p>Face interviews early</p></li><li><p>Iterate based on feedback</p></li></ul><p>That&#8217;s the difference.</p><div><hr></div><h3>The Long Game Perspective</h3><p>There is nothing extreme here.</p><p>No hacks.</p><p>No 
shortcuts.</p><p>Just:</p><ul><li><p>Clarity</p></li><li><p>Consistency</p></li><li><p>Execution</p></li></ul><p>Repeated long enough.</p><div><hr></div><h3>If You&#8217;re in This Phase</h3><p>If you have:</p><ul><li><p>2&#8211;4 years of experience</p></li><li><p>Some exposure to data engineering</p></li><li><p>And are trying to move into stronger roles</p></li></ul><p>Then your focus should not be:</p><blockquote><p>&#8220;What else should I learn?&#8221;</p></blockquote><p>It should be:</p><blockquote><p>&#8220;How do I execute better?&#8221;</p></blockquote><div><hr></div><h3>If You Want to Understand This in a Structured Way</h3><p>I break this down in detail in a live masterclass.</p><ul><li><p>What actually matters in interviews</p></li><li><p>How to structure your preparation</p></li><li><p>How to avoid common mistakes</p></li></ul><p>You can register here:<br></p><p><a href="https://aws.sachin.cloud">https://aws.sachin.cloud</a></p><p>If you attend, you&#8217;ll also get access to my<br><strong>Agentic AI&#8211;powered Data Engineering course</strong> (worth $100 / &#8377;8000)</p><p>Join only if you&#8217;re serious about building this the right way.</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://sachinchandrashekhar.substack.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading The Long Game in Data Engineering! 
Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div>]]></content:encoded></item><item><title><![CDATA[Schema Evolution in Databricks Delta: What No One Tells You About Bronze, Silver, and Gold]]></title><description><![CDATA[Best Practices for Bronze, Silver, and Gold with SQL and PySpark]]></description><link>https://sachinchandrashekhar.substack.com/p/schema-evolution-in-databricks-delta</link><guid isPermaLink="false">https://sachinchandrashekhar.substack.com/p/schema-evolution-in-databricks-delta</guid><dc:creator><![CDATA[Sachin Chandrashekhar]]></dc:creator><pubDate>Tue, 28 Apr 2026 16:58:11 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!o3LO!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F39449c60-cecb-497d-b83a-d5756804e64d_820x365.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h1>Schema Evolution in Databricks Delta: What No One Tells You About Bronze, Silver, and Gold</h1><p><em>A practical guide for data engineers who want resilient medallion pipelines &#8212; without the silent disasters</em></p><div><hr></div><p>Here&#8217;s a scenario I see constantly with engineers transitioning into modern data engineering:</p><p>A source system drops a column. Or adds one. Maybe renames it. The ingestion job either crashes immediately or &#8212; worse &#8212; silently swallows the change. Two weeks later, your Gold dashboards are showing wrong numbers, your data science team is asking questions you can&#8217;t answer, and you&#8217;re reverse-engineering what broke and when.</p><p>This is not a Databricks problem. It&#8217;s a schema governance problem. And Delta Lake gives you the tools to solve it &#8212; if you understand where the boundaries actually are.</p><p>Let me break this down layer by layer.</p><div><hr></div><h2>The Core Mental Model: One Rule to Anchor Everything</h2><p>Here&#8217;s the practical rule that should govern your entire medallion architecture:</p><blockquote><p><strong>Let Bronze absorb additive changes safely. Make Silver and Gold evolve intentionally.</strong></p></blockquote><p>That&#8217;s it. Everything else flows from this.</p><p>Schema evolution in Delta Lake is a <em>resilience feature for ingestion</em> &#8212; not a substitute for data contracts and downstream engineering discipline. The moment you treat <code>mergeSchema</code> as a reason to stop thinking about what changes mean, you&#8217;ve taken on invisible debt that will surface at the worst time.</p><div><hr></div><h2>Why Source Schemas Change (And Why You Can&#8217;t Ignore It)</h2><p>New columns get added. Old columns disappear. Names get changed in ways that look harmless but break your joins. 
Types shift from string to integer because the upstream team &#8220;cleaned up&#8221; their model.</p><p>Delta Lake addresses this with two fundamental features:</p><ul><li><p><strong>Schema enforcement</strong> &#8212; rejects writes that don&#8217;t conform to the existing table schema</p></li><li><p><strong>Schema evolution</strong> &#8212; allows the schema to update when you explicitly opt in</p></li></ul><p>This combination means you&#8217;re not stuck choosing between &#8220;fail on every upstream change&#8221; and &#8220;let anything through.&#8221; You get to be deliberate about it &#8212; by layer.</p><div><hr></div><h2>Medallion Layer Responsibilities</h2><p>Think of each layer as having a different job, and therefore a different schema policy:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!o3LO!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F39449c60-cecb-497d-b83a-d5756804e64d_820x365.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!o3LO!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F39449c60-cecb-497d-b83a-d5756804e64d_820x365.png 424w, https://substackcdn.com/image/fetch/$s_!o3LO!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F39449c60-cecb-497d-b83a-d5756804e64d_820x365.png 848w, https://substackcdn.com/image/fetch/$s_!o3LO!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F39449c60-cecb-497d-b83a-d5756804e64d_820x365.png 1272w, 
https://substackcdn.com/image/fetch/$s_!o3LO!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F39449c60-cecb-497d-b83a-d5756804e64d_820x365.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!o3LO!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F39449c60-cecb-497d-b83a-d5756804e64d_820x365.png" width="820" height="365" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/39449c60-cecb-497d-b83a-d5756804e64d_820x365.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:365,&quot;width&quot;:820,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:52209,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://sachinchandrashekhar.substack.com/i/195772789?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F39449c60-cecb-497d-b83a-d5756804e64d_820x365.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!o3LO!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F39449c60-cecb-497d-b83a-d5756804e64d_820x365.png 424w, https://substackcdn.com/image/fetch/$s_!o3LO!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F39449c60-cecb-497d-b83a-d5756804e64d_820x365.png 848w, 
https://substackcdn.com/image/fetch/$s_!o3LO!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F39449c60-cecb-497d-b83a-d5756804e64d_820x365.png 1272w, https://substackcdn.com/image/fetch/$s_!o3LO!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F39449c60-cecb-497d-b83a-d5756804e64d_820x365.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><div><hr></div><h2>Bronze: Built to Absorb</h2><p>Bronze should tolerate new columns automatically. 
This is where <code>mergeSchema</code> earns its keep.</p><p><strong>PySpark &#8212; append with schema evolution:</strong></p><pre><code><code>bronze_df.write \
  .format("delta") \
  .mode("append") \
  .option("mergeSchema", "true") \
  .saveAsTable("main.bronze.orders")
</code></code></pre><p><strong>SQL &#8212; merge with schema evolution:</strong></p><pre><code><code>-- WITH SCHEMA EVOLUTION belongs directly after MERGE, not at the end of the statement
MERGE WITH SCHEMA EVOLUTION INTO main.bronze.orders AS t
USING staging_orders AS s
ON t.order_id = s.order_id
WHEN MATCHED THEN UPDATE SET *
WHEN NOT MATCHED THEN INSERT *;
</code></code></pre><p>Both patterns let a new source column land without dropping and recreating the target table. Table history is preserved. You don&#8217;t need an emergency deployment just because an upstream team added <code>order_region</code> to their API response.</p><p>That&#8217;s the Bronze promise: keep the historical record intact, don&#8217;t fail on harmless additions.</p><div><hr></div><h2>Silver: The Governed Transformation Contract</h2><p>Silver is where the thinking happens.</p><p>Even though Delta <em>can</em> absorb anything upstream, Silver is where column naming, typing, nullability expectations, and business semantics get curated. A new Bronze column doesn&#8217;t automatically deserve a spot in Silver. Someone has to decide:</p><ul><li><p>Does this column belong in the cleaned data model?</p></li><li><p>Does it need transformation before it&#8217;s usable?</p></li><li><p>Are downstream consumers ready for it?</p></li></ul><p><strong>PySpark &#8212; explicit column projection:</strong></p><pre><code><code>silver_df = (
    spark.table("main.bronze.orders")
    .select(
        "order_id",
        "customer_id",
        "order_ts",
        "status",
        "new_source_column"  # added deliberately after review
    )
)
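
# Hedged sketch, not from the original post: check column types against the
# Silver contract before the write below. SILVER_CONTRACT and type_mismatches
# are hypothetical names, and the type strings are assumed for illustration.
SILVER_CONTRACT = {
    "order_id": "bigint",
    "customer_id": "bigint",
    "order_ts": "timestamp",
    "status": "string",
}

def type_mismatches(contract, dtypes):
    """Compare a {column: type} contract with (name, type) pairs such as DataFrame.dtypes."""
    actual = dict(dtypes)
    return {
        col: (want, actual.get(col))  # (expected type, type found or None if absent)
        for col, want in contract.items()
        if actual.get(col) != want
    }

# Assumed usage with the DataFrame built above:
# bad = type_mismatches(SILVER_CONTRACT, silver_df.dtypes)
# if bad:
#     raise TypeError(f"Silver contract violated: {bad}")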

silver_df.write \
  .format("delta") \
  .mode("append") \
  .option("mergeSchema", "true") \
  .saveAsTable("main.silver.orders")
</code></code></pre><p><strong>SQL &#8212; explicit contract merge:</strong></p><pre><code><code>-- WITH SCHEMA EVOLUTION belongs directly after MERGE, not at the end of the statement
MERGE WITH SCHEMA EVOLUTION INTO main.silver.orders AS t
USING (
  SELECT
    order_id,
    customer_id,
    order_ts,
    status,
    new_source_column
  FROM main.bronze.orders
) AS s
ON t.order_id = s.order_id
WHEN MATCHED THEN UPDATE SET *
WHEN NOT MATCHED THEN INSERT *;
</code></code></pre><p>Notice what both of these do: they <em>name</em> what&#8217;s going in. The contract is visible in the code. If someone reviews this PR, they can see that <code>new_source_column</code> was a deliberate choice, not an accident.</p><div><hr></div><h2>Gold: The Stable Product Layer</h2><p>Gold should be the least surprising layer for anyone consuming your data.</p><p>New columns in Gold should only appear when there&#8217;s an actual reporting, analytics, ML, or product use case that requires them. If the new column matters for that use case &#8212; add it. If it doesn&#8217;t &#8212; leave Gold unchanged. Let Bronze and Silver hold it until there&#8217;s a reason to promote it.</p><p>This prevents source-system churn from leaking into dashboards and business-facing datasets. Your BI team should never open a report and find a column they didn&#8217;t know about.</p><div><hr></div><h2>The More Dangerous Problem: Dropped Columns</h2><p>Everyone asks about adding columns. The dropped column scenario is where teams actually get hurt.</p><p>Delta schema evolution helps you absorb <em>additive</em> drift. A removed column is a different situation.</p><p>If a source stops sending a column, Bronze doesn&#8217;t necessarily need to be rebuilt. New rows can still land with <code>NULL</code> in that position. Old rows retain their previous values. The table can keep running.</p><p>The real danger is <strong>downstream logic</strong>. If Silver or Gold notebooks contain:</p><pre><code><code>SELECT dropped_col ...
WHERE dropped_col = 'X' ...
</code></code></pre><p>Those jobs are now at risk. If the column has been removed from the table they read, they fail outright. If the column still exists but new rows carry <code>NULL</code>, they keep running and quietly return wrong results: the filter above matches none of the new rows. Either way, the transformation code still depends on data the source no longer sends.</p><p><strong>Safe response to a dropped column:</strong></p><ol><li><p>Confirm it&#8217;s truly deprecated at the source &#8212; not a temporary outage or deployment issue</p></li><li><p>Update Bronze ingestion expectations if the column was treated as mandatory</p></li><li><p>Remove or replace references in Silver transformations, tests, and expectations</p></li><li><p>Remove or replace references in Gold aggregates, dashboards, and semantic models</p></li><li><p>Optionally clean up the metadata with <code>ALTER TABLE ... DROP COLUMN</code> once everything downstream is updated</p></li></ol><pre><code><code>ALTER TABLE main.silver.orders DROP COLUMN dropped_col;
</code></code></pre><p><code>DROP COLUMN</code> requires Delta column mapping to be enabled on the table. With it, the operation is metadata-only &#8212; fast and non-disruptive, no full table recreation needed. The old values stay in the underlying data files until those files are rewritten.</p><div><hr></div><h2>The <code>SELECT *</code> Question</h2><p>I get asked this constantly. The short answer: use it in Bronze, avoid it in Silver and Gold.</p><p>In curated layers, explicit column lists are the better practice. They make the data contract visible, reviewable, and testable. When you use <code>SELECT *</code> in Silver or Gold, new Bronze columns can flow downstream unexpectedly &#8212; which sounds convenient until it quietly breaks a BI semantic layer or an ML feature pipeline.</p><p>Explicit projections in PySpark work well here:</p><pre><code><code>expected_cols = ["order_id", "customer_id", "order_ts", "status"]
silver_df = spark.table("main.bronze.orders").select(*expected_cols)
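
# Hedged sketch, not part of the original pipeline: a pure-Python check that
# compares the Silver contract with whatever Bronze currently exposes, so drift
# gets reported instead of going unnoticed. detect_schema_drift is a
# hypothetical helper name introduced here for illustration.
def detect_schema_drift(expected, actual):
    """Return columns that appeared in, or vanished from, the source."""
    expected_set, actual_set = set(expected), set(actual)
    return {
        "added": sorted(actual_set - expected_set),    # new upstream columns awaiting review
        "missing": sorted(expected_set - actual_set),  # contract columns the source dropped
    }

# Assumed usage, ideally run before the select above
# (DataFrame.columns returns the column names as a list):
# drift = detect_schema_drift(expected_cols, spark.table("main.bronze.orders").columns)
# if drift["missing"]:
#     raise ValueError(f"Bronze is missing contract columns: {drift['missing']}")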
</code></code></pre><p>If Bronze evolves, Silver doesn&#8217;t unless you decide it should.</p><div><hr></div><h2>The Governance Model That Actually Works</h2><p>Here&#8217;s a practical framework that balances flexibility and control:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!G6uO!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe50eac8e-bf5a-4ef8-9b33-38dc5403c8c3_826x721.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!G6uO!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe50eac8e-bf5a-4ef8-9b33-38dc5403c8c3_826x721.png 424w, https://substackcdn.com/image/fetch/$s_!G6uO!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe50eac8e-bf5a-4ef8-9b33-38dc5403c8c3_826x721.png 848w, https://substackcdn.com/image/fetch/$s_!G6uO!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe50eac8e-bf5a-4ef8-9b33-38dc5403c8c3_826x721.png 1272w, https://substackcdn.com/image/fetch/$s_!G6uO!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe50eac8e-bf5a-4ef8-9b33-38dc5403c8c3_826x721.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!G6uO!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe50eac8e-bf5a-4ef8-9b33-38dc5403c8c3_826x721.png" width="826" height="721" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/e50eac8e-bf5a-4ef8-9b33-38dc5403c8c3_826x721.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:721,&quot;width&quot;:826,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:102496,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://sachinchandrashekhar.substack.com/i/195772789?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe50eac8e-bf5a-4ef8-9b33-38dc5403c8c3_826x721.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!G6uO!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe50eac8e-bf5a-4ef8-9b33-38dc5403c8c3_826x721.png 424w, https://substackcdn.com/image/fetch/$s_!G6uO!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe50eac8e-bf5a-4ef8-9b33-38dc5403c8c3_826x721.png 848w, https://substackcdn.com/image/fetch/$s_!G6uO!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe50eac8e-bf5a-4ef8-9b33-38dc5403c8c3_826x721.png 1272w, https://substackcdn.com/image/fetch/$s_!G6uO!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe50eac8e-bf5a-4ef8-9b33-38dc5403c8c3_826x721.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>And the operating model that makes this real:</p><ul><li><p>Upstream teams announce additive <em>and</em> breaking changes before deployment &#8212; this is a lightweight data contract, not just a courtesy message</p></li><li><p>Data engineering validates in a lower environment before production rollout</p></li><li><p>Bronze accepts new columns safely when appropriate</p></li><li><p>Silver and Gold evolve only after a conscious design decision</p></li><li><p>Monitoring and tests detect unexpected schema drift early</p></li></ul><div><hr></div><h2>Two Misconceptions Worth Clearing Up</h2><p><strong>Misconception 1: &#8220;Delta schema evolution means no engineering work after source changes.&#8221;</strong></p><p>It means fewer <em>ingestion failures</em>. 
It does not mean curated transformations and downstream contracts maintain themselves. Those still require human decisions.</p><p><strong>Misconception 2: &#8220;You have to drop and recreate tables after schema changes.&#8221;</strong></p><p>Almost never. Delta tables can evolve in place, preserving history and avoiding disruptive rebuilds. Even when a column disappears from the source, table recreation is usually unnecessary &#8212; what matters is removing invalid downstream references and deciding later whether to drop the column from metadata.</p><div><hr></div><h2>The Final Position</h2><p>Delta Lake schema evolution is best understood as a <strong>controlled flexibility mechanism</strong>.</p><p>It&#8217;s excellent for keeping Bronze ingestion robust when source systems add columns. It simplifies the physical handling of approved schema changes in Silver and Gold. But stable medallion pipelines still depend on clear ownership, explicit downstream contracts, and coordination between source teams and data engineers.</p><p>The tools work. The question is always whether the <em>process</em> around them is designed with enough intentionality.</p><p>Get that right, and schema changes stop being emergencies. They become routine &#8212; handled calmly, layer by layer, with history intact and consumers informed.</p><div><hr></div><p><em>If you&#8217;re building on Databricks and want to go deeper on medallion architecture design, Unity Catalog governance, or PySpark transformation patterns &#8212; this is exactly the kind of topic we teach in RADE (Real-world AWS Data Engineering). Check out https://dataengineeringhub.in and <a href="https://sachinchandrashekhar.com">https://sachinchandrashekhar.com</a> for more.</em></p><div><hr></div><p><em>Was this useful? Share it with a data engineer who&#8217;s dealt with a schema surprise at 2am. 
They&#8217;ll appreciate it.</em></p>]]></content:encoded></item><item><title><![CDATA[40% of a Data Engineering Team Got Cut. Here's What They Didn't Know.]]></title><description><![CDATA[A manager I know told me something that stopped me mid-conversation.]]></description><link>https://sachinchandrashekhar.substack.com/p/40-of-a-data-engineering-team-got</link><guid isPermaLink="false">https://sachinchandrashekhar.substack.com/p/40-of-a-data-engineering-team-got</guid><dc:creator><![CDATA[Sachin Chandrashekhar]]></dc:creator><pubDate>Mon, 27 Apr 2026 00:17:24 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!NLVh!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa146a109-6850-46f9-8a96-514d4d7023c1_236x236.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>A manager I know told me something that stopped me mid-conversation.</p><p>His friend &#8212; a manager at Wells Fargo in Bangalore &#8212; watched 40% of his data engineering team get let go.</p>
<p>Not because they were bad engineers. Not because the business was struggling. But because after introducing Amazon Q and an AI co-pilot, the work that once needed 10 people could now be done by 6. The math was simple. The human cost wasn&#8217;t.</p><p>I&#8217;ve seen a version of this at my own company too &#8212; our projects now go through peer code reviews, and where that process used to take significant back-and-forth, I&#8217;m able to implement the suggested changes a lot faster using Amazon Q. What used to be hours of rework is now handled in a fraction of the time.</p><p>This isn&#8217;t a story about fear. It&#8217;s a story about what skills protect you &#8212; and what skills leave you exposed.</p><p>And it starts with understanding the ground beneath your feet.</p><div><hr></div><h2>What Data Engineering Actually Is (Before We Talk About the Fancy Stuff)</h2><p>Strip away the buzzwords and data engineering is just this:</p><p>You have data sitting somewhere. You move it, clean it, transform it, and land it somewhere useful. That&#8217;s it.</p><p>The &#8220;somewhere&#8221; on both ends? 
That&#8217;s where it gets interesting &#8212; and where most people get lost.</p><p>Your source data could be in a transactional database (Oracle, MySQL), in files dropped by an external vendor, or already sitting in a data lake your upstream team owns. The destination could be a data warehouse, another data lake, or something newer that I&#8217;ll get to in a moment.</p><p>Your job as a data engineer is to build reliable pipelines between those two points. In bulk. At scale. Without losing data. Without it breaking at 2am.</p><p>That&#8217;s the job. The tools change. The fundamentals don&#8217;t.</p><div><hr></div><h2>The Three Eras &#8212; And Why They All Still Exist</h2><h3>Era 1: The Data Warehouse</h3><p>Before data lakes were a thing, companies loaded everything into data warehouses. Teradata. Netezza. Oracle. These were powerful machines, purpose-built for analytical queries.</p><p>They ran on something called MPP &#8212; Massively Parallel Processing &#8212; where data was distributed across nodes and queried in parallel. Fast. Reliable. Battle-tested.</p><p>Also: expensive. We&#8217;re talking $30&#8211;40 million a year for large enterprises. And rigid. Every table had to be modeled upfront. ETL developers would spend weeks just designing the schema before a single row of data moved.</p><p>You had to know what your data looked like <em>before</em> you stored it. That was the deal.</p><h3>Era 2: The Data Lake</h3><p>Then came the data lake.</p><p>The idea was elegant: stop modeling everything upfront. Just dump your data &#8212; structured, semi-structured, raw files, all of it &#8212; into cheap storage. On AWS, that&#8217;s S3. Pennies per gigabyte. No schema required. You figure out what you need later.</p><p>This was the birth of ELT (Extract, Load, Transform) &#8212; load first, transform second. 
You land the file, throw an AWS Glue Crawler at it to infer the schema, register it in the Glue Catalog, and suddenly you&#8217;ve got a queryable table via Athena. No server. No pre-built schema. Run SQL straight on top of your S3 files.</p><p>You can even build dashboards on this &#8212; Amazon QuickSight can connect directly to Athena. Great for quick POCs. Great for proving the value of new datasets before investing in heavy infrastructure.</p><p><strong>But here&#8217;s the problem nobody told you about:</strong></p><p>You can&#8217;t update a file.</p><p>A CSV sitting in S3 is a file. If record ID 3 changes its address from &#8220;Mumbai&#8221; to &#8220;Bangalore,&#8221; you can&#8217;t just <code>UPDATE</code> that file. There&#8217;s no insert/update/delete in a data lake. No ACID properties. No transactions.</p><p>And then there&#8217;s schema drift. Your source team sends you a file every day with 20 columns. One morning you wake up and it has 25 columns &#8212; or a data type changed from integer to string &#8212; and your pipeline breaks, silently or loudly. There&#8217;s no built-in protection for that.</p><p>Data lakes were cheap and flexible. They were also fragile.</p><h3>Era 3: The Data Lakehouse</h3><p>This is where we are now.</p><p>The data lakehouse takes the cheap, scalable storage of a data lake and adds the transactional capabilities of a data warehouse &#8212; on top of the same S3 layer you already have.</p><p>How? Through open table formats. The three most important ones:</p><ul><li><p><strong>Delta Lake</strong> (popularized by Databricks)</p></li><li><p><strong>Apache Iceberg</strong> (increasingly the industry default, especially on AWS)</p></li><li><p><strong>Apache Hudi</strong> (strong at CDC and streaming use cases)</p></li></ul><p>These aren&#8217;t just file formats. 
They&#8217;re metadata layers that sit on top of your S3 files and give you things you never had in a plain data lake:</p><p><strong>ACID transactions.</strong> You can now insert, update, and delete individual records. That record ID 3 with the changed address? Just run <code>MERGE INTO</code> and it&#8217;s handled.</p><p><strong>Schema evolution.</strong> When the source team adds columns, the table doesn&#8217;t break. The format adjusts. You can even configure it to reject schema changes until you&#8217;ve explicitly validated them &#8212; which is the right call for production pipelines.</p><p><strong>Time travel.</strong> Every write to a lakehouse table creates a new version. Want to know what your table looked like 3 days ago? Query it. Want to roll back to a previous state because bad data got loaded? Do it. This is version control for your data, built in by default.</p><p>The result: you keep the economics of a data lake (S3 storage, serverless compute) and gain the reliability of a data warehouse. That&#8217;s why lakehouses have become the default architecture at companies serious about their data platform.</p><div><hr></div><h2>The AWS Toolkit, Honestly Explained</h2><p>If you&#8217;re working on AWS, here&#8217;s how the pieces connect &#8212; without the marketing fluff.</p><p><strong>AWS Glue</strong> is your primary ETL service. It&#8217;s serverless, meaning you don&#8217;t manage servers. You write PySpark code, point it at your data, and Glue provisions the cluster, runs the job, and tears it down. You pay only for what you use. For most companies starting out, Glue is the right choice.</p><p>The tradeoff: it&#8217;s a bit more expensive per compute hour than managing your own cluster. That&#8217;s the price of convenience.</p><p><strong>Amazon EMR</strong> (Elastic MapReduce) is what companies graduate to when Glue starts getting expensive. You configure your own Spark cluster &#8212; choose the instance types, the number of nodes, the memory. 
More control, more complexity, lower cost at scale. EMR also has a serverless flavor if you want a middle ground.</p><p>Both Glue and EMR use PySpark underneath. Same language. Different operating model.</p><p><strong>Amazon Athena</strong> is your serverless query engine. It reads directly from S3, uses the Glue Catalog for table metadata, and lets you run SQL on raw files without moving data anywhere. Perfect for exploration, quick POCs, and lightweight transformations before you need something more robust.</p><p><strong>Databricks</strong> sits across all of this as an alternative path &#8212; an end-to-end platform that bundles compute, storage, Delta Lake, ML tooling, and orchestration into one console. If AWS feels like assembling a puzzle from individual pieces, Databricks hands you a mostly assembled one. The tradeoff is vendor lock-in and cost. Both paths lead to PySpark.</p><p>And that&#8217;s the point: <strong>PySpark is the common thread.</strong> Whether you&#8217;re on Glue, EMR, or Databricks, PySpark is what you&#8217;ll use to process data at scale. Get that foundation right and the rest becomes a matter of configuration.</p><div><hr></div><h2>The Question Nobody Is Asking</h2><p>Here&#8217;s what I see every week in my community:</p><p>Smart, experienced IT professionals &#8212; people who&#8217;ve spent 8, 10, 15 years on Informatica, SSIS, Teradata &#8212; who know their current tools deeply but feel increasingly like those tools are aging out beneath them.</p><p>They&#8217;re right to feel that.</p><p>But the answer isn&#8217;t panic. It&#8217;s not &#8220;learn everything in 3 months.&#8221; The answer is a structured, prioritized skill stack &#8212; built on fundamentals that don&#8217;t change even when the tools do.</p><p>The data engineering lifecycle is: ingest &#8594; store &#8594; process &#8594; serve &#8594; monitor. That hasn&#8217;t changed since the mainframe days. 
What&#8217;s changed is the technology at each layer, and how quickly AI tools can help you work within those layers.</p><p>The engineers who got cut at Wells Fargo weren&#8217;t doing the wrong things. They were doing things that AI could replicate faster. The engineers who didn&#8217;t get cut were the ones who understood <em>why</em> the systems were designed the way they were &#8212; who could prompt an AI tool intelligently, review its output critically, and architect solutions that neither humans nor machines could produce on their own.</p><p>That&#8217;s the skill you&#8217;re building toward. Not memorizing API documentation. Not blindly running tutorials. Understanding the landscape well enough to make real decisions.</p><p>The lake is no longer enough. The warehouse is too expensive to be the whole story. The lakehouse is where production data engineering lives in 2025.</p><p>Now you know why.</p><div><hr></div><p><em>If you&#8217;re an IT professional looking to transition into cloud data engineering &#8212; AWS, PySpark, lakehouse architecture, real production projects &#8212; I run a program called RADE (Real-world AWS Data Engineering).</em></p><p>You can register for my free MasterClass where I talk about the Roadmap to high paying jobs here:</p><p>https://aws.sachin.cloud</p><p>If you attend, you&#8217;ll also get access to my<br><br><strong>Agentic AI&#8211;powered Data Engineering course</strong> (worth $100 / &#8377;8000) &#8212;<br><br>where I show how to use AI tools to accelerate real engineering work.</p><p>Join only if you&#8217;re serious about getting into high paying AWS roles.</p>]]></content:encoded></item><item><title><![CDATA[My Student Told Me He Might Get Laid Off. This Is What I Told Him.]]></title><description><![CDATA[The Mistake That Keeps Data Professionals Stuck (Even After 5 Years)]]></description><link>https://sachinchandrashekhar.substack.com/p/my-student-told-me-he-might-get-laid</link><guid isPermaLink="false">https://sachinchandrashekhar.substack.com/p/my-student-told-me-he-might-get-laid</guid><dc:creator><![CDATA[Sachin Chandrashekhar]]></dc:creator><pubDate>Thu, 16 Apr 2026 23:35:31 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!NLVh!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa146a109-6850-46f9-8a96-514d4d7023c1_236x236.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>My student reached out to me recently for a 1&#8211;1 call.</p><p>He wasn&#8217;t a beginner.</p>
<p>In fact, he had done what many people struggle to do.</p><p>He had:</p><ul><li><p>Moved from a non-tech background into analytics</p></li><li><p>Built dashboards and pipelines</p></li><li><p>Even worked on a data warehouse and AWS systems</p></li></ul><p>On paper, he was doing well.</p><p>But there was a problem.</p><p>He said:</p><blockquote><p>&#8220;I might get laid off soon. I want to switch to data engineering&#8230; but I&#8217;m not sure how to position myself.&#8221;</p></blockquote><p>This is where most people get stuck.</p><p>Not because they lack skills.</p><p>But because they lack <strong>clarity</strong>.</p><div><hr></div><h2>The Real Problem Was Not Skills</h2><p>As we spoke, one thing became clear.</p><p>He wasn&#8217;t confused about technology.</p><p>He was confused about <strong>identity</strong>.</p><p>His resume reflected:</p><ul><li><p>Data Analyst</p></li><li><p>Analytics Engineer</p></li><li><p>Some Data Engineering work</p></li></ul><p>Everything mixed together.</p><p>And that is exactly what hurts you in the market.</p><p>Because hiring managers don&#8217;t hire &#8220;mixed profiles.&#8221;</p><p>They hire <strong>clarity</strong>.</p><div><hr></div><h2>The Market Does Not Reward &#8220;Jack of All Trades&#8221;</h2><p>At one point, I told him something very simple:</p><blockquote><p>&#8220;Your resume should reflect the role you want &#8212; not just the work you&#8217;ve done.&#8221;</p></blockquote><p>This is where most professionals go wrong.</p><p>They try to be honest.</p><p>They try to show everything.</p><p>They try to 
say:</p><ul><li><p>&#8220;I&#8217;ve done analytics&#8230;&#8221;</p></li><li><p>&#8220;I&#8217;ve also worked on pipelines&#8230;&#8221;</p></li><li><p>&#8220;I&#8217;ve also touched AWS&#8230;&#8221;</p></li></ul><p>But what the hiring manager sees is:</p><blockquote><p>&#8220;This person is not a specialist.&#8221;</p></blockquote><p>And in today&#8217;s market, that is a problem.</p><p>Because teams are not looking for generalists.</p><p>They are looking for:</p><ul><li><p>Someone who can <strong>own a problem</strong></p></li><li><p>Someone who has gone <strong>deep</strong></p></li><li><p>Someone who can contribute <strong>from day one</strong></p></li></ul><div><hr></div><h2>Depth Changes Everything</h2><p>During the conversation, the student himself said something important.</p><p>When he worked with data engineers, he noticed:</p><ul><li><p>Their systems were <strong>scalable</strong></p></li><li><p>Their code was <strong>structured</strong></p></li><li><p>Everything was <strong>plug-and-play</strong></p></li></ul><p>That was his turning point.</p><p>He realized:</p><blockquote><p>Analytics often solves immediate problems.<br>Engineering solves problems that <em>scale</em>.</p></blockquote><p>And that is the difference.</p><div><hr></div><h2>The Shift You Need to Make</h2><p>If you are trying to move into data engineering, understand this:</p><p>It is not just about learning PySpark.</p><p>It is not just about doing a course.</p><p>It is about <strong>realignment</strong>.</p><p>You need to align:</p><ul><li><p>Your resume</p></li><li><p>Your LinkedIn</p></li><li><p>Your projects</p></li><li><p>Your narrative</p></li></ul><p>All towards one thing:</p><blockquote><p>&#8220;I am a Data Engineer.&#8221;</p></blockquote><p>Not &#8220;trying to become one.&#8221;</p><p>Not &#8220;part-time.&#8221;</p><p>Not &#8220;50-50.&#8221;</p><p>Clear.</p><p>Focused.</p><p>Intentional.</p><div><hr></div><h2>A Practical Strategy I Gave Him</h2><p>I didn&#8217;t give him a 
motivational speech.</p><p>I gave him a strategy:</p><ol><li><p><strong>Refactor your resume completely</strong></p><ul><li><p>Remove confusion</p></li><li><p>Focus only on data engineering</p></li></ul></li><li><p><strong>A/B test your positioning</strong></p><ul><li><p>Apply for 1&#8211;2 weeks</p></li><li><p>If calls are low &#8594; double down on data engineering positioning</p></li></ul></li><li><p><strong>Focus on depth, not breadth</strong></p><ul><li><p>PySpark</p></li><li><p>Real-world issues (memory, performance)</p></li><li><p>Systems thinking</p></li></ul></li><li><p><strong>Ignore unnecessary distractions</strong></p><ul><li><p>Not every company tests DSA (data structures and algorithms)</p></li><li><p>Not every tool matters</p></li></ul></li></ol><p>Focus on what moves the needle.</p><div><hr></div><h2>The Hard Truth</h2><p>Most people delay this decision.</p><p>They keep one foot in analytics.</p><p>One foot in engineering.</p><p>And they stay stuck for years.</p><p>Because they never commit.</p><div><hr></div><h2>The Long Game</h2><p>Careers are not built on reacting to layoffs.</p><p>They are built on:</p><ul><li><p>deliberate decisions</p></li><li><p>consistent depth</p></li><li><p>and clear positioning</p></li></ul><p>If you do that:</p><p>The market starts recognizing you differently.</p><div><hr></div><h2>If You&#8217;re Serious About This</h2><p>I write about these real scenarios &#8212; not theory.</p><p>Not generic advice.</p><p>But what actually happens when professionals try to transition into data engineering.</p><p>If this is something you&#8217;re working through:</p><p>You can read more here &#8594;</p><p>https://dataengineeringhub.substack.com</p><h3>If You&#8217;re Trying to Transition to Higher-Paying AWS Roles</h3><p>If you&#8217;re currently:</p><ul><li><p>Moving from analytics to data engineering</p></li><li><p>Sitting on partial exposure but lacking depth</p></li><li><p>Or trying to reposition yourself in the market</p></li></ul><p>Then the next step is not more 
random tutorials.</p><p>It&#8217;s understanding:</p><ul><li><p>What actually matters in real-world data engineering</p></li><li><p>What skills move the needle in interviews</p></li><li><p>And how to build depth &#8212; not just surface-level knowledge</p></li></ul><p>I break this down in a live masterclass.</p><p>You can register here:<br></p><p>https://aws.sachin.cloud</p><p>If you attend, you&#8217;ll also get access to my<br><strong>Agentic AI&#8211;powered Data Engineering course</strong> (worth $100 / &#8377;8000) &#8212;<br>where I show how to use AI tools to accelerate real engineering work.</p><p>Join only if you&#8217;re serious about getting into high paying AWS roles.</p>]]></content:encoded></item><item><title><![CDATA[The Market Isn’t Bad. It’s Selective. 
(Insights From a Community Discussion This Saturday)]]></title><description><![CDATA[This Saturday, during a live discussion with members of our data engineering community, we spoke openly about the job market.]]></description><link>https://sachinchandrashekhar.substack.com/p/the-market-isnt-bad-its-selective</link><guid isPermaLink="false">https://sachinchandrashekhar.substack.com/p/the-market-isnt-bad-its-selective</guid><dc:creator><![CDATA[Sachin Chandrashekhar]]></dc:creator><pubDate>Sat, 21 Feb 2026 18:11:09 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!NLVh!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa146a109-6850-46f9-8a96-514d4d7023c1_236x236.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>India vs US.<br>Interview patterns.<br>What companies are actually testing.<br>Why some candidates struggle.<br>And why others move ahead quietly.</p><p>Here&#8217;s what became very clear.</p>
<p>The market is not dead.</p><p>It is selective.</p><p>And selectivity exposes shallow preparation.</p><div><hr></div><h3>What Interviews Actually Look Like Now</h3><p>Across geographies, members shared a similar pattern.</p><p>Companies are no longer impressed by:</p><ul><li><p>&#8220;I completed 5 courses.&#8221;</p></li><li><p>&#8220;I know Spark, Kafka, Airflow, Snowflake.&#8221;</p></li><li><p>&#8220;Here are 200 LeetCode solutions.&#8221;</p></li></ul><p>Instead, interviews are evolving into conversations like:</p><ul><li><p>Why did you choose this architecture?</p></li><li><p>What trade-offs did you consider?</p></li><li><p>How would you reduce cost in this pipeline?</p></li><li><p>What happens if this job fails at 2 AM?</p></li><li><p>How would you design this for scale?</p></li></ul><p>This is engineering thinking.</p><p>Not tutorial repetition.</p><div><hr></div><h3>India vs US: Structural Differences</h3><p>During our discussion, some differences became obvious.</p><h4>&#127470;&#127475; India Market</h4><ul><li><p>Often more tool-specific questioning.</p></li><li><p>More implementation detail focus.</p></li><li><p>Sometimes more interview rounds.</p></li><li><p>Higher competition at mid-level roles.</p></li></ul><h4>&#127482;&#127480; US Market</h4><ul><li><p>Strong emphasis on system design.</p></li><li><p>Ownership mindset.</p></li><li><p>Trade-off discussions.</p></li><li><p>Architectural clarity.</p></li></ul><p>But here&#8217;s the common denominator:</p><p>Both markets reward depth.</p><p>Neither rewards superficial exposure.</p><div><hr></div><h3>Why Many 
Candidates Struggle</h3><p>One thing that stood out in the discussion:</p><p>Many candidates prepare reactively.</p><p>They:</p><ul><li><p>Watch content.</p></li><li><p>Memorize answers.</p></li><li><p>Practice common questions.</p></li><li><p>Hope the interviewer sticks to the script.</p></li></ul><p>But modern interviews are adaptive.</p><p>If you cannot reason through ambiguity,<br>you struggle.</p><p>If you cannot explain decisions,<br>you stall.</p><p>If you cannot connect business needs to technical design,<br>you plateau.</p><p>The market isn&#8217;t punishing you.</p><p>It&#8217;s measuring you.</p><div><hr></div><h3>What Actually Wins (As Observed in Real Conversations)</h3><p>From both experience and community insights, five patterns stand out:</p><p>1&#65039;&#8419; Deep understanding of fundamentals<br>2&#65039;&#8419; Real-world project exposure<br>3&#65039;&#8419; Ability to articulate trade-offs<br>4&#65039;&#8419; Structured practice<br>5&#65039;&#8419; Consistency over bursts</p><p>You don&#8217;t need 20 tools.</p><p>You need mastery of fewer tools at greater depth.</p><div><hr></div><h3>The Calm Advantage</h3><p>The engineers who are moving ahead right now are not panicking.</p><p>They are:</p><ul><li><p>Studying deliberately.</p></li><li><p>Practicing system design.</p></li><li><p>Building projects with cost and scale in mind.</p></li><li><p>Discussing architectures with peers.</p></li><li><p>Thinking long term.</p></li></ul><p>They are playing the long game.</p><div><hr></div><h3>Final Thought</h3><p>If you feel overwhelmed by the market,<br>don&#8217;t ask:</p><blockquote><p>&#8220;Is the market bad?&#8221;</p></blockquote><p>Ask:</p><blockquote><p>&#8220;Am I prepared at the level the market expects?&#8221;</p></blockquote><p>That question changes everything.</p><p>Because in 2026,<br>the winners won&#8217;t be the fastest learners.</p><p>They&#8217;ll be the deepest.</p>]]></content:encoded></item><item><title><![CDATA[Why Working Harder Stopped Working for Me]]></title><description><![CDATA[On productivity, balance, and unlearning an old success formula]]></description><link>https://sachinchandrashekhar.substack.com/p/why-working-harder-stopped-working</link><guid isPermaLink="false">https://sachinchandrashekhar.substack.com/p/why-working-harder-stopped-working</guid><dc:creator><![CDATA[Sachin Chandrashekhar]]></dc:creator><pubDate>Sun, 08 Feb 2026 03:12:56 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!NLVh!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa146a109-6850-46f9-8a96-514d4d7023c1_236x236.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Sometimes, we get stuck with the past&#8212;especially when that past brought us success.</p><p>In the summer of 2003, when I was in 12th grade at a private class called <em>Expert Coaching Classes</em> in Mangalore, my maths teacher told us something that stayed with me:</p>
<p><strong>&#8220;You shouldn&#8217;t be wasting a single second of your time this year.&#8221;</strong></p><p>I took that advice extremely seriously.</p><p>So seriously that I remember turning down a movie plan with my sister and cousin.<br><em>Kal Ho Na Ho</em> was huge back then&#8212;but I said no. 
I chose to study instead.</p><p>I sacrificed a lot that year.</p><p>And when the results came out, I scored <strong>96 in Chemistry, 98 in Maths, and 100 in Physics</strong>.</p><p>For more than two decades after that, I was stuck with this formula for success:</p><p><strong>Never waste time.<br>Use every possible hour.</strong></p><p>It worked&#8212;because I was 17.</p><p>But I carried the same mindset into adulthood.<br>Into my career.<br>Into relationships.<br>Into health.<br>Into everything.</p><p>And that&#8217;s where the problems began.</p><div><hr></div><p>I struggled every single year to bring balance into my life.</p><p>Career.<br>Physical health.<br>Mental health.<br>Relationships.</p><p>I would fixate on <em>one</em> thing at a time, obsess over it, and neglect everything else&#8212;exactly how I did in 12th grade.</p><p>Things got even harder after I started teaching AWS in 2024.</p><p>Suddenly, I wasn&#8217;t just balancing a career.<br>I was also building a coaching business.</p><p>Something had to change.</p><div><hr></div><p>So in 2025, I started searching seriously for answers.</p><p>I read a lot.<br>And a few books fundamentally changed how I think about productivity:</p><ul><li><p><em>Buy Back Your Time</em></p></li><li><p><em>Deep Work</em></p></li><li><p><em>Atomic Habits</em></p></li><li><p><em>Digital Minimalism</em></p></li><li><p><em>Indistractable</em></p></li></ul><p>Here&#8217;s what finally clicked for me:</p><p><strong>Productivity is not about cramming more hours.<br>It&#8217;s about creating a schedule&#8212;and respecting it.</strong></p><p>It&#8217;s about intentionally allocating your waking hours so that <em>every important area of your life</em> gets attention.</p><p>And this isn&#8217;t a one-time fix.</p><p>Old habits don&#8217;t die easily.<br>This kind of optimization takes months&#8212;sometimes longer.</p><p>I&#8217;m still working on it as I write this.</p><div><hr></div><p>One powerful lesson I learned from Dan Martell 
was this:</p><p><strong>Learn to say NO.</strong></p><p>But here&#8217;s the catch:</p><p>You can only confidently say NO when you are <em>crystal clear</em> about what you&#8217;ll do with the time you save.</p><p>In the end, everything boils down to <strong>clarity</strong>.</p><p>Clarity in thought.<br>Clarity in action.</p><p>Most of us don&#8217;t lack solutions.<br>We&#8217;re overwhelmed by too many of them&#8212;and that&#8217;s what makes prioritization so hard.</p><p>So how do you fix that?</p><p>By being persistent.<br>By being relentless about your goals.</p><p>A ship without a captain will drift in the ocean.</p><p>When I look back, I can clearly see how often I drifted&#8212;until recently.</p><div><hr></div><p>One last thing.</p><p>For many years, I ignored one of my mentor&#8217;s simplest pieces of advice:</p><p><strong>Read books.</strong></p><p>If there&#8217;s one thing I&#8217;d strongly recommend&#8212;it&#8217;s this:</p><p>Read.</p><p>Whatever area of life you want to improve, books are a treasure trove of wisdom.</p><p>Go to Perplexity.ai.<br>Ask for the best books on that topic.<br>Pick one.<br>Start reading.</p><p>You won&#8217;t regret it. I promise.</p>
]]></content:encoded></item><item><title><![CDATA[When 12 Years of Experience Suddenly Isn’t Enough Anymore]]></title><description><![CDATA[Last week, I had a 1-on-1 coaching call with a senior data engineer.]]></description><link>https://sachinchandrashekhar.substack.com/p/when-12-years-of-experience-suddenly</link><guid isPermaLink="false">https://sachinchandrashekhar.substack.com/p/when-12-years-of-experience-suddenly</guid><dc:creator><![CDATA[Sachin Chandrashekhar]]></dc:creator><pubDate>Tue, 03 Feb 2026 01:45:31 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!NLVh!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa146a109-6850-46f9-8a96-514d4d7023c1_236x236.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>12+ years of experience.<br>IBM &#8594; Teradata &#8594; Big Data &#8594; Azure &#8594; Databricks &#8594; Snowflake.<br>Worked in India. Then moved to the US on H1B.<br>Multiple enterprise clients. Big brands. Real production work.</p><p>On paper, this is a <em>strong</em> profile.</p>
<p>And yet, he was rejected in the <strong>interview</strong>.</p><p>The feedback was blunt:</p><blockquote><p>&#8220;Your projects all look the same.&#8221;</p></blockquote><p>That sentence stayed with me long after the call ended.</p><p><strong>The uncomfortable truth about senior data engineering careers</strong></p><p>This wasn&#8217;t a skill issue.</p><p>He knows Spark.<br>He knows Databricks.<br>He knows Azure deeply.<br>He understands data pipelines end to end.</p><p>The real issue was something more subtle &#8212; and far more common than people realize.</p><p><strong>Repetition.</strong></p><p>Every project on his resume followed the same pattern:</p><ul><li><p>Ingest data into a raw layer</p></li><li><p>Transform with Spark</p></li><li><p>Publish to curated / gold</p></li><li><p>Different company, different timeline&#8230; same architecture</p></li></ul><p>This was cutting-edge in 2018.<br>In 2026, it&#8217;s <em>expected</em>.</p><p>What once made you valuable can quietly become the reason you&#8217;re filtered out.</p><p><strong>Why &#8220;more experience&#8221; isn&#8217;t the answer anymore</strong></p><p>A lot of senior engineers respond to this situation by thinking:</p><blockquote><p>&#8220;I just need one more big project.&#8221;<br>&#8220;I need deeper Spark knowledge.&#8221;<br>&#8220;I need another certification.&#8221;</p></blockquote><p>But that&#8217;s not the real gap.</p><p>The gap is <strong>how your experience is framed</strong> &#8212; and how the market now evaluates senior talent.</p><p>At 10&#8211;15 years of experience, companies 
aren&#8217;t just hiring:</p><ul><li><p>builders</p></li><li><p>implementers</p></li><li><p>task executors</p></li></ul><p>They&#8217;re hiring people who can:</p><ul><li><p>make architectural decisions</p></li><li><p>justify trade-offs</p></li><li><p>think in cost, scale, and governance</p></li><li><p>explain <em>why</em> one approach beats another</p></li></ul><p>That&#8217;s a different job &#8212; even if the title still says <em>Data Engineer</em>.</p><div><hr></div><h2>The &#8220;Azure &#8594; AWS&#8221; illusion many engineers fall into</h2><p>One part of our conversation really stood out.</p><p>He told me:</p><blockquote><p>&#8220;I actually got more interview calls for AWS roles than I expected.&#8221;</p></blockquote><p>And in those interviews, he did well &#8212; <em>almost</em> well enough.</p><p>He explained confidently:</p><ul><li><p>Databricks &#8594; EMR</p></li><li><p>ADF &#8594; Glue</p></li><li><p>Synapse &#8594; Redshift</p></li></ul><p>The interviewer agreed.<br>He liked the profile.</p><p>But the final response was:</p><blockquote><p>&#8220;I like your understanding&#8230; but you don&#8217;t have real AWS experience.&#8221;</p></blockquote><p>This is where many experienced engineers get stuck.</p><p><strong>Conceptual mapping is not the same as architectural confidence.</strong></p><p>AWS interviews don&#8217;t just test:</p><ul><li><p>Can you build this?</p></li></ul><p>They test:</p><ul><li><p>Why would you choose <em>this</em> service?</p></li><li><p>What happens when data volume grows 10x?</p></li><li><p>How much does this cost at scale?</p></li><li><p>What breaks first?</p></li><li><p>How would you secure it?</p></li></ul><p>That&#8217;s not about tools.<br>That&#8217;s about <strong>decision-making</strong>.</p><div><hr></div><h2>The quiet shift from &#8220;developer&#8221; to &#8220;architect&#8221;</h2><p>Here&#8217;s the part most people don&#8217;t talk about openly:</p><p>You don&#8217;t become an architect by getting 
promoted.<br>You become one by <strong>changing how you think and speak</strong>.</p><p>Developers focus on:</p><ul><li><p>implementation</p></li><li><p>correctness</p></li><li><p>delivery</p></li></ul><p>Architects focus on:</p><ul><li><p>trade-offs</p></li><li><p>patterns</p></li><li><p>constraints</p></li><li><p>cost</p></li><li><p>governance</p></li><li><p>stakeholder expectations</p></li></ul><p>Same data.<br>Same cloud.<br>Completely different mental model.</p><p>And here&#8217;s the key mistake many people make:</p><p>They wait for the <em>architect title</em> before acting like one.</p><p>In reality, it works the other way around.</p><div><hr></div><h2>Why AWS exposes career ceilings faster</h2><p>In Azure + Databricks ecosystems, many teams converge on a single &#8220;default&#8221; approach:</p><ul><li><p>ADF + Databricks + ADLS</p></li><li><p>Spark everywhere</p></li><li><p>Medallion architecture everywhere</p></li></ul><p>AWS is different.</p><p>For the same problem, you might reasonably choose:</p><ul><li><p>Lambda</p></li><li><p>Glue</p></li><li><p>Athena</p></li><li><p>EMR</p></li><li><p>Redshift</p></li><li><p>ECS/Fargate</p></li></ul><p>Each choice has <strong>cost</strong>, <strong>scale</strong>, and <strong>operational</strong> implications.</p><p>That&#8217;s why AWS interviews feel harder &#8212; not because the tech is harder, but because <strong>you&#8217;re forced to justify your thinking</strong>.</p><p>And that&#8217;s exactly why AWS becomes a growth catalyst for senior engineers who feel stuck.</p><div><hr></div><h2>Architecture isn&#8217;t about knowing everything</h2><p>Another misconception I see a lot:</p><blockquote><p>&#8220;I need to know everything before I move to an architect role.&#8221;</p></blockquote><p>No architect knows everything.</p><p>What they <em>do</em> know is:</p><ul><li><p>how to evaluate options</p></li><li><p>how to say <em>no</em> to bad ideas</p></li><li><p>how to explain trade-offs to non-technical 
stakeholders</p></li><li><p>how to design systems that won&#8217;t collapse under scale or cost</p></li></ul><p>They also understand adjacent areas:</p><ul><li><p>data governance</p></li><li><p>access control</p></li><li><p>security boundaries</p></li><li><p>data modeling</p></li><li><p>batch vs streaming trade-offs</p></li></ul><p>Not because they implement all of it &#8212; but because they <strong>lead conversations</strong> about it.</p><div><hr></div><h2>The part nobody wants to hear (but needs to)</h2><p>Career pivots don&#8217;t happen over a weekend.</p><p>They don&#8217;t happen in 30-day challenges.<br>They don&#8217;t happen after one course.</p><p>Real clarity takes:</p><ul><li><p>months of consistent learning</p></li><li><p>exposure to multiple architectures</p></li><li><p>reflection</p></li><li><p>rewriting your resume</p></li><li><p>rewriting your narrative</p></li></ul><p>For most experienced engineers, this is a <strong>6&#8211;8 month journey</strong>.</p><p>And that&#8217;s okay.</p><p>If you&#8217;re 10&#8211;15 years into your career, this isn&#8217;t a failure point.<br>It&#8217;s an <strong>inflection point</strong>.</p><div><hr></div><h2>A message I want senior engineers to hear clearly</h2><p>If you&#8217;re feeling:</p><ul><li><p>bored</p></li><li><p>repetitive</p></li><li><p>boxed into the same architecture</p></li><li><p>worried about long-term growth</p></li></ul><p>You&#8217;re not behind.</p><p>You&#8217;re just being asked to evolve.</p><p>The engineers who thrive long-term aren&#8217;t the ones who chase every new tool.<br>They&#8217;re the ones who learn how to <strong>think in systems</strong>, <strong>decisions</strong>, and <strong>outcomes</strong>.</p><p>That shift is uncomfortable.<br>But it&#8217;s also where careers open up again.</p><p>I see this pattern very often.<br>And I&#8217;ve seen enough people on the other side of it to say this confidently:</p><p>&#128073; You&#8217;re not stuck.<br>&#128073; You&#8217;re just at 
the point where you will need to add skills that matter.</p><p>And that&#8217;s not a bad place to be.</p><p>You can get to your dream role if you consistently add deep skills!</p>]]></content:encoded></item></channel></rss>