Is there something better than SQL?

onlinepersona@programming.dev · 1 year ago

Is there something better than SQL?

JackbyDev@programming.dev · 1 year ago

No, SQL is great for relational data. There’s like decades of research into it. It’s the best.

sip@programming.dev · 1 year ago

relational databases have years of reseach into them, not the query language itself.

sql was built so people other than devs can use it, but we got stuck with it.

normalexit@lemmy.world · 1 year ago

SQL has been around and evolving since the 70s. It is an ISO/IEC standard and is portable across a multitude of databases.

https://en.m.wikipedia.org/wiki/SQL

sip@programming.dev · 1 year ago

portable, my ass. excuse my french.

each system has it’s own dialect and quirks

lysdexic@programming.dev · edit-2 1 year ago

each system has it’s own dialect and quirks

That does not mean that SQL, as specified by one of it’s standard versions, is not portable. It just means that some implementations fail to comply with the standard and/or provide their own extensions.

If an implementation fails to comply with the standard, that’s a failure on the side of the implementation, not a failure of SQL.

lysdexic@programming.dev · 1 year ago

sql was built so people other than devs can use it, but we got stuck with it.

Not really. Being designed with UX in mind, so that it sacrifices conciseness for readability, does not make it something for “people other than devs”.

Likewise, BASIC was also developed with UX in mind, and no one in their right mind would ever claim that it’s not a programming language.

r1veRRR@feddit.de · 1 year ago

SQL is horrible as a language to read or write. There’s a million different variants, because it lacks so many basic things. And when used in other code, you generally end up string concatinating one language in another language, with all the HORRIBLE bugs something like that brings about.

Imagine Backend People said we should just write adhoc Javascript for the frontend by concatinating the “correct” code in the backend.

JackbyDev@programming.dev · 1 year ago

You should use parameterization, not concatenation.

Kissaki@feddit.de · edit-2 1 year ago

most programmers aren’t DB experts and the SQL they output is quite often terrible.

Isn’t that looking at it the wrong way / missing the point?

If you’re fine with simple queries, use an ORM of your tech stack. Once you have to understand querying performance and weigh usage patterns, constraints, and cache performance and size, then it’s not about SQL anymore. It’s about understanding the DBMS you use.

You may ask “why can’t I use a different language for the querying part of it”. But I haven’t seen anything better yet.

Having a common, well-understood, established, documented, inter-product compatible language has a lot of value. Using a different language to the well-established industry standard is in itself an increased hurdle to adoption.

Getting back to the original quote: I don’t think anything else would serve bad developers / non-experts any better.

onlinepersona@programming.dev · 1 year ago

If you’re fine with simple queries, use an ORM of your tech stack. Once you have to understand querying performance and weigh usage patterns, constraints, and cache performance and size, then it’s not about SQL anymore. It’s about understanding the DBMS you use.

Simple queries don’t result in simple SQL. How many joins and subqueries do you think an SQL query would require in order fulfill “Give me the top 10 artists of the 90s whose albums were nominated for the MTV awards but didn’t win”?

In Django looks something like

nineties = (date(1,1,1990), date(31, 12, 1999)
album_range=Q(albums__release_date__range=nineties)
artists = Artists.objects.annotate(
  albums_sold=Sum("albums__sales", filter=album_range)),
).filter(
  album_range,
  nominations__date__range=nineties,
  nominations__won=False
).order_by("-albums_sold")
top_artists = artists[:10]

What if one method wants the result of that but only wants the artists’ names, but another one wanted additional or other fields? In django you could simply use artists.only(*field_names) and each method would provide a different set of field names. What would that look like without a capable ORM? Do you think somebody would refactor the method to add a field_names argument? In my experience the result is a bunch of copy pasted queries that modify the query itself to add the fieldnames.

Another common thing is querying related objects. Say you simply wanted to have information about the record label of the aforementioned artists while handling the artists. A many-to-one relationship (artist has one record label, record label has many artists). You could either artist.record_label while in your for-loop, but that would trigger an query for every artist (1+n problem). Or in django that’s artists.select_related("record_label") and it will get all the record_labels in the same query.
If it’s a many-to-many relationship for example “festivals”, then .prefetch_related() will first select the artists, then make a second query of festivals of those artists, and artist.festivals would be available.

An ORM like django makes that simple. SQL, does not.

So, before we even get to the DB optimisation part (which indices to create, whether a view is better or now, which storage engine to use, WAL size, yadayadayada), there’s an entire interface / language that makes writing bad code very easy.

Spore@lemmy.ml · 1 year ago

Datalog. Basically relational, but have much simpler syntax and semantics,and is able to convey more complicated queries in a composable manner. Mainly used in Clojure-based databases where actual programs instead of strings are used.

onlinepersona@programming.dev · 1 year ago

🤔 interesting. First order logic for querying. I found CozoDB which might be useful. Still not sure about how to map objects to the DB. The tutorial is not very relatable. Lots of numbers and single letters.

Thanks for the hint though. I’ll check it out.

Spore@lemmy.ml · edit-2 1 year ago

deleted by creator

cwagner@lemmy.cwagner.me · edit-2 1 year ago

deleted by creator

boo one@lemmy.one · 1 year ago

I have seen that this is still a problem, even in established enterprise companies.

cwagner@lemmy.cwagner.me · edit-2 1 year ago

deleted by creator

onlinepersona@programming.dev · 1 year ago

“If you don’t know what I know, you shouldn’t be here”

“I don’t make mistakes that end up #3 on the vulnerability list, and if you do, I don’t want you around me”

I find that a gate keeperish attitude.

Miaou@jlai.lu · 1 year ago

Gates exist for a reason

glue_snorter@lemmy.sdfeu.org · 1 year ago

the translation step from binary (program) -> text (SQL) -> binary (server)

Your concern about this is misguided. Inter-process communication always has to cross a barrier, by definition.

I take it http also feels wrong to you?

Binary protocols do exist, e.g. gRPC, but they incur costs of their own.

onlinepersona@programming.dev · 1 year ago

the translation step from binary (program) -> text (SQL) -> binary (server)

Also, the translation step from binary (program) -> text (SQL) -> binary (server), just feels quite wrong. For HTML and CSS, it’s fine, but for SQL, where injection is still in the top 10 security risks, is there something better?

Does this not answer your question?

Imacat@lemmy.dbzer0.com · 1 year ago

I’m absolutely biased as a data engineer who loves SQL, but there are some good reasons why SQL has been the de facto standard for interacting with databases since the 80s.

One of its draws is that it’s easy to understand. I can show a stakeholder that I’m selecting “sum(sale_amount) from transactions where date=yesterday” and they understand it. Many analysts are even able to write complicated queries when they don’t know anything else about programming.

Since it’s declarative, you rarely have to think about all the underlying fuckery that lets you query something like terabytes of data in redshift in minutes.

Debugging is often pretty nice too. I can take some query that didn’t do what it was supposed to and run it over and over in a console until the output is right.

lysdexic@programming.dev · 1 year ago

I’m absolutely biased as a data engineer who loves SQL, but there are some good reasons why SQL has been the de facto standard for interacting with databases since the 80s.

I find it funny how the people who actually have to wrangle data swear by SQL as awesome, but there are always random hacks coming out of the woodwork, who don’t even look at SQL at all, with sweeping statements claiming SQL sucks because reasons.

It’s like the most opinionated people against SQL are the ones who don’t use SQL.

r1veRRR@feddit.de · 1 year ago

I’d recommend everyone check out https://prql-lang.org/. It’s SQL, but readable and writable in a sane way.

And no, SQL is NOT readable or writable for anything involving more than a single join.

luckystarr@feddit.de · 1 year ago

You can write selects with many joins, as long they are regular and either add a column or reduce the result set. You have to write the joins explicitly though. Just shoving all of the restrictions into the where clause will definitely confuse everybody.

koreth@lemm.ee · 1 year ago

SQL, where injection is still in the top 10 security risks

This is absolutely true, but it’s not what it looks like on the surface, and if you dig into the OWASP entry for this, you’ll see they talk about mitigation.

You can completely eliminate the possibility of injection attacks using well-understood technologies such as bind variables, which an ORM will usually use under the covers but which you can also use with your own queries. There are many, many database applications that have never once had a SQL injection vulnerability and never will.

The reason SQL injection is a widespread security risk, to be blunt, is that there are astonishingly large numbers of inexperienced and/or low-skill developers out there who haven’t learned how to use the tools at their disposal. The techniques for avoiding injection vulnerability are simple and have been well-documented for literally decades but they can’t help if a lousy dev decides to ignore them.

Now, a case could be made that it’d be better if instead, we were using a query language (maybe even a variant of SQL) that made injection attacks impossible. I agree in principle, but (a) I think this ends up being a lot harder than it looks if you want to maintain the same expressive power and flexibility SQL has, (b) given that SQL exists, “get bad devs to stop using SQL” doesn’t seem any more likely to succeed than “get bad devs to use bind variables,” and © I have too much faith in the ability of devs to introduce security vulnerabilities against all odds.

frezik@midwest.social · edit-2 1 year ago

It’s also worth noting that the current entry on OWASP is injection generically. It includes SQL, but it also covers things like HTTP links where you concatenate in unchecked user input. SQL injection by itself may no longer be prominent enough make the list.

I’m also going to put the blame squarely on PHP for SQL injection attacks hanging on for so long, particularly when combined with MySQL. That DB didn’t support bind variables for a long time (maybe still doesn’t?). Other languages may have used a library that simulated bind variables for you. Barring that, they tended to always always always show how to use a sanitation function even if it was the first mention in the first tutorial for the top Google result for “sql [language]”. That creates a culture in the language of writing safe SQL code. Not PHP, though; the sanitation functions were there, but they never gave them the prominence that they so badly needed.