A common trap that inexperienced programmers fall into when working with databases is failing to appreciate the full complexity of what their queries actually do. Knowing SQL syntax or how to write joins are obvious items on the checklist of essential skills. Sometimes, however, those skills are not matched by an understanding of how queries affect the database as a whole. Without that understanding, developers can unknowingly write queries that are inefficient or even harmful.
The crux of the problem often lies in the lack of detailed knowledge of how databases manage data, how different SQL commands affect performance, and how proper indexing works. Moreover, a shallow understanding of the database schema can lead to redundant or incorrect data retrieval. This not only slows down the operation of the database, but can also lead to incorrect data analysis, affecting business decisions.
Fortunately, whenever advice is needed, a novice programmer has a whole 2.25 billion pages at their disposal (even more by now, considering that figure dates from 2022). However, with all this information around us, and precisely because there is so much of it, some of it is outdated. What worked a decade ago may not be the best approach today, due to advances in database management systems, changes in best practices, and improvements in hardware capabilities.
For example, some SQL optimization tips that were once essential for performance, such as manually creating indexes for every join, can now be handled automatically by more advanced database engines. Sticking rigidly to these old methodologies can lead to wasted effort and suboptimal results, especially when newer database versions already optimize these cases on their own.
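Before hand-crafting an index, it is worth asking the engine what it actually plans to do. The sketch below is a minimal illustration using Python's built-in sqlite3 module, with made-up table and column names: it inspects the query plan, adds an index, and confirms that the plan changed.

```python
import sqlite3

# Minimal sketch: check the query plan before (and after) adding an index by hand.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)")
conn.executemany(
    "INSERT INTO orders (customer_id, total) VALUES (?, ?)",
    [(i % 100, i * 1.5) for i in range(10_000)],
)

query = "SELECT total FROM orders WHERE customer_id = ?"

def show_plan(label):
    # EXPLAIN QUERY PLAN reports whether SQLite scans the whole table or uses an index.
    plan = conn.execute("EXPLAIN QUERY PLAN " + query, (42,)).fetchall()
    print(label, [row[3] for row in plan])

show_plan("before index:")   # expected to show a full scan of orders
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
show_plan("after index:")    # expected to show a search using idx_orders_customer
```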
Making changes to your database without an implementation plan is like traveling without a map, phone, or even a clear destination: it's risky, and it probably won't end well. Inexperienced programmers often rush to modify the database with plenty of enthusiasm but no structured plan, and the resulting changes can cause more chaos than improvement.
Deploying changes involves much more than tweaking a bit of SQL here or adding an index there. Without a clear plan, these changes can disrupt the service, corrupt data, or introduce new errors into a system that previously worked well.
Inexperienced programmers often deploy queries directly to production, assuming that if a query runs without errors, it must be correct and efficient. When untested or poorly tested queries reach real-world environments, this approach can lead to serious problems, including performance bottlenecks, incorrect data retrieval, and even system downtime.
Skipping query verification omits a crucial step: making sure queries work properly under real-world operating conditions and perform efficiently under typical load. Without this stage, queries that seemed harmless in a controlled development environment can behave unpredictably or become problematic when subjected to real-world pressure.
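One lightweight way to make verification routine is to run every candidate query against a realistically sized staging copy and check both its results and its runtime before promoting it. The following is only a sketch of such a check, with invented table names and thresholds, using Python and sqlite3:

```python
import sqlite3
import time

def verify_query(conn, sql, params, expected_rows, max_seconds):
    """Run a candidate query and flag obvious correctness or runtime problems."""
    start = time.perf_counter()
    rows = conn.execute(sql, params).fetchall()
    elapsed = time.perf_counter() - start

    problems = []
    if len(rows) != expected_rows:
        problems.append(f"expected {expected_rows} rows, got {len(rows)}")
    if elapsed > max_seconds:
        problems.append(f"took {elapsed:.3f}s, budget is {max_seconds:.3f}s")
    return problems

# Stand-in for a staging copy populated with production-like volumes.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)")
conn.executemany(
    "INSERT INTO orders (customer_id, total) VALUES (?, ?)",
    [(i % 100, i * 1.5) for i in range(50_000)],
)

issues = verify_query(
    conn,
    "SELECT customer_id, SUM(total) FROM orders GROUP BY customer_id",
    (),
    expected_rows=100,   # one row per distinct customer_id in this test data
    max_seconds=0.5,     # arbitrary runtime budget for the example
)
print(issues or "query passed verification")
```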
A significant oversight that we've all seen more than once or twice is the lack of a rollback plan. Without a clear strategy for undoing changes made to the database, there is no safety net if something goes wrong. This can lead to prolonged downtime and data discrepancies, and it complicates the data recovery process.
For example, if a newly added column inadvertently disrupts other database functions, or a data migration loses or corrupts records, the lack of a rollback plan means there is no quick way to restore order. The result can be operational disruptions and urgent, often chaotic efforts to fix the problem without compromising data integrity.
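A rollback plan does not have to be elaborate; it can be as simple as writing the undo step next to every change before the change is applied. The sketch below shows one possible convention, again in Python with sqlite3 and an invented migration (note that DROP COLUMN requires SQLite 3.35 or newer):

```python
import sqlite3

# Hypothetical migration: every change ships together with its own rollback step.
MIGRATION = {
    "name": "add_loyalty_points_to_customers",
    "apply": "ALTER TABLE customers ADD COLUMN loyalty_points INTEGER DEFAULT 0",
    "revert": "ALTER TABLE customers DROP COLUMN loyalty_points",
}

def run(conn, statement):
    # Apply one step and commit it only if it succeeds.
    conn.execute(statement)
    conn.commit()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")

run(conn, MIGRATION["apply"])    # deploy the change
# ... if monitoring later shows the change broke something:
run(conn, MIGRATION["revert"])   # the prepared rollback, no improvisation needed
```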
Inexperienced developers often overlook the importance of verifying database performance at scale. A query or system can work flawlessly under the small load of a few test cases, yet collapse when subjected to the full weight of actual use. Failing to test the system under its expected operating load can lead to serious problems: slow response times, timeouts, and transaction failures once the system is finally deployed.
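Even a rough load test catches many of these surprises early. The sketch below is one hypothetical shape for such a test: it loads production-like row counts into a throwaway SQLite file, fires concurrent read requests at it, and reports the latency distribution rather than a single lucky run.

```python
import os
import sqlite3
import statistics
import tempfile
import time
from concurrent.futures import ThreadPoolExecutor

# Throwaway database file with production-like row counts.
db_path = os.path.join(tempfile.gettempdir(), "load_test_demo.db")
if os.path.exists(db_path):
    os.remove(db_path)

setup = sqlite3.connect(db_path)
setup.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, user_id INTEGER, payload TEXT)")
setup.executemany(
    "INSERT INTO events (user_id, payload) VALUES (?, ?)",
    [(i % 1_000, "x" * 100) for i in range(200_000)],
)
setup.commit()
setup.close()

def one_request(user_id):
    # Each simulated client opens its own connection, as an application server would.
    conn = sqlite3.connect(db_path)
    start = time.perf_counter()
    conn.execute("SELECT COUNT(*) FROM events WHERE user_id = ?", (user_id,)).fetchone()
    conn.close()
    return time.perf_counter() - start

with ThreadPoolExecutor(max_workers=20) as pool:
    latencies = sorted(pool.map(one_request, range(500)))

print(f"mean latency: {statistics.mean(latencies):.4f}s")
print(f"p95 latency:  {latencies[int(len(latencies) * 0.95)]:.4f}s")
```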
So why do they decide not to test?
Let's be honest: most people who get into programming don't do it out of an innate love of talking to people. Developers are often drawn to the field by their love of code, problem-solving, and perhaps the quiet focus it requires. However, when it comes to managing and developing databases in a company, dodging conversations is not a good idea. Keeping the lines of communication closed can lead to isolated efforts, redundancy, and lost opportunities to draw on collective knowledge.
When you have a room full of people trying to find a solution without talking to each other, they can either spend hours, maybe days, thinking about it alone — or start a conversation, throw out a few ideas and maybe work them out together in a fraction of that time.
Having, or not having, a clear Service Level Agreement (SLA) or performance baseline can be the difference between a predictable, optimized system and an unmanageable mess.
The lack of a clear performance baseline or SLA means there is no agreed benchmark for evaluating system performance or responsiveness. This lack of clarity can lead to misunderstandings between developers and customers, or between team members themselves, and makes it difficult to agree on whether the system is working properly. If a database query returns results within two seconds, is that fast enough? Without benchmarks or performance goals, such a question may be impossible to answer.
Not to mention that without well-defined expectations, optimizing system performance becomes a moving target. Developers may not know which aspects of the system should be prioritized for improvement or how to allocate resources effectively. This can result in effort wasted on work that has little impact on user satisfaction or business goals.
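Once a baseline is agreed, it can be turned into an automated check, so that "is two seconds fast enough?" has a yes-or-no answer. The example below assumes a hypothetical target of a two-second p95 for a reporting query; the table, query, and numbers are invented for illustration.

```python
import sqlite3
import time

# Hypothetical SLA agreed with stakeholders: p95 latency of the report query <= 2.0 s.
SLA_P95_SECONDS = 2.0
SAMPLE_RUNS = 50

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (id INTEGER PRIMARY KEY, region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales (region, amount) VALUES (?, ?)",
    [(f"region_{i % 10}", i * 0.75) for i in range(100_000)],
)

latencies = []
for _ in range(SAMPLE_RUNS):
    start = time.perf_counter()
    conn.execute("SELECT region, SUM(amount) FROM sales GROUP BY region").fetchall()
    latencies.append(time.perf_counter() - start)

p95 = sorted(latencies)[int(len(latencies) * 0.95)]
status = "within SLA" if p95 <= SLA_P95_SECONDS else "SLA breached"
print(f"{status}: p95 = {p95:.3f}s (target {SLA_P95_SECONDS}s)")
```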
By default, Oracle systems are set to delete diagnostic logs after eight days. Inexperienced developers can be taken by surprise by this default setting and end up without data that can be crucial for diagnosing problems or understanding performance trends over time.
Why does it matter? Diagnostic data in databases such as Oracle contains critical information: error logs, execution histories, system health indicators, and more. This data is invaluable when retracing database operations to pinpoint where something may have gone wrong. It also plays a vital role in fine-tuning the database for optimal performance based on past activity.
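If the eight-day window refers to Oracle's Automatic Workload Repository (AWR), whose snapshot retention defaults to eight days, the setting can be inspected and extended before any history is lost. The sketch below uses the python-oracledb driver with placeholder connection details; the 30-day retention value (43,200 minutes) is only an example.

```python
import oracledb

# Placeholder credentials; replace with real connection details.
conn = oracledb.connect(user="admin", password="secret", dsn="dbhost/orclpdb1")
cur = conn.cursor()

# Check the current snapshot interval and retention window.
cur.execute("SELECT snap_interval, retention FROM dba_hist_wr_control")
for interval, retention in cur:
    print(f"snapshot interval: {interval}, retention: {retention}")

# Extend retention to 30 days (the value is given in minutes).
cur.execute("""
    BEGIN
        dbms_workload_repository.modify_snapshot_settings(retention => 43200);
    END;
""")
conn.commit()
```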
Finally, it is all too easy for developers to focus on immediate tasks or short-term goals without considering the broader impact of their work on the system as a whole. The ability to look further ahead comes with experience, and the lack of a holistic vision can lead to a fragmented approach in which individual components are developed or optimized with no regard for how they fit into the larger project. The result is a system that may function in parts but lacks coherence and scalability as a whole.