- cross-posted to:
- programmerhumor@lemmy.ml
As a (data) scientist I am not super familiar with most databases, but duckdb is great for what I need it for.
What’s that? Did you say you needed an RDBMS that can also handle JSON data? Well have I got good news for you!
MySQL / MariaDB can handle it too! Just use BLOB 🤣
pg can actually query into json fields!
MySQL can too, slow af tho.
oh i didn’t know that. iirc postgres easily beats mongo in json performance which is a bit embarrassing.
Holy, never knew, and never would expect. Postgres truly is king.
And you can add indexes on those JSON fields too!
Kind of. I hope you don’t like performance…
The performance is actually not bad. You’re far better off using conventional columns but in the one off cases where you have to store queryable JSON data, it actually performs quite well.
Quite well is very subjective. It’s much slower than columns or specialized databases like MongoDB.
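For anyone who wants to see what "querying into JSON fields" and "indexing them" looks like in practice, here is a minimal sketch. It is an assumption-laden stand-in: it uses SQLite (via Python's built-in `sqlite3`, assuming the JSON functions are compiled in) so it runs without a server, and the `events` table and its keys are made up. The rough Postgres equivalents are given in the comments. It says nothing about the performance debate above.

```python
import json
import sqlite3

# SQLite stands in for Postgres here so the sketch runs with no server.
# Rough Postgres equivalents, assuming a jsonb column named data:
#   SELECT data->>'name' FROM events WHERE data->>'kind' = 'login';
#   CREATE INDEX events_kind ON events ((data->>'kind'));
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, data TEXT)")

# Hypothetical sample documents.
rows = [
    {"kind": "login", "name": "alice"},
    {"kind": "logout", "name": "bob"},
    {"kind": "login", "name": "carol"},
]
conn.executemany(
    "INSERT INTO events (data) VALUES (?)",
    [(json.dumps(r),) for r in rows],
)

# Expression index on one JSON key, so the filter below can use an index
# instead of scanning every row.
conn.execute(
    "CREATE INDEX events_kind ON events (json_extract(data, '$.kind'))"
)


def names_for_kind(kind):
    """Return the 'name' field of every event whose JSON 'kind' matches."""
    cur = conn.execute(
        "SELECT json_extract(data, '$.name') FROM events "
        "WHERE json_extract(data, '$.kind') = ? ORDER BY id",
        (kind,),
    )
    return [name for (name,) in cur]


login_names = names_for_kind("login")
```

In Postgres, a GIN index on the whole jsonb column is the other common option; it serves containment queries (`@>`) rather than the single-key equality shown here.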
This is literally me at every possible discussion regarding any other RDBMS.
My coworkers joked that I got paid for promoting Postgres.
Then we switched from Percona to Patroni and everyone agreed that… fuck yes, PostgreSQL is the best.
I used to agree, but recently tried out Clickhouse for high ingestion rate time series data in the financial sector and I’m super impressed by it. Postgres was struggling and we migrated.
This isn’t to say that it’s better overall by any means, but simply that I did actually find a better tool at a certain limit.
I’ve been using ClickHouse too and it’s significantly faster than Postgres for certain analytical workloads. I benchmarked it and while Postgres took 47 seconds, ClickHouse finished within 700ms when performing a query on the OpenFoodFacts dataset (~9GB). Interestingly enough TimescaleDB (Postgres extension) took 6 seconds.
| Database | Insertion | Query speed |
|---|---|---|
| ClickHouse | 23.65 MB/s | ≈650ms |
| TimescaleDB | 12.79 MB/s | ≈6s |
| Postgres | - | ≈47s |
| SQLite | 45.77 MB/s¹ | ≈22s |
| DuckDB | 8.27 MB/s¹ | crashed |

All actions were performed through DataGrip.

¹ Insertion speed is influenced by reduced networking overhead due to the databases being in-process.
Updates and deletes don’t work as well and not being able to perform an upsert can be quite annoying. However, I found the ReplacingMergeTree and AggregatingMergeTree table engines to be good replacements so far.
Also there’s !clickhouse@programming.dev
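A rough sketch of why ReplacingMergeTree can stand in for an upsert: you insert a newer row with the same sorting key, and when parts are merged only the latest version survives. The code below is a plain-Python simulation of that idea, not ClickHouse code; the `Row` fields and the sample tickers are hypothetical.

```python
from typing import NamedTuple


class Row(NamedTuple):
    key: str       # plays the role of the table's sorting key
    version: int   # plays the role of the optional version column
    value: float


def merge_replacing(parts):
    """Collapse rows sharing a key, keeping the highest version.

    Ties on version keep the row seen last, i.e. the most recent insert.
    """
    latest = {}
    for part in parts:
        for row in part:
            current = latest.get(row.key)
            if current is None or row.version >= current.version:
                latest[row.key] = row
    return sorted(latest.values(), key=lambda r: r.key)


# Two "parts": the second re-inserts AAPL with a newer version (the "upsert").
part_1 = [Row("AAPL", 1, 189.5), Row("MSFT", 1, 410.0)]
part_2 = [Row("AAPL", 2, 190.1)]

merged = merge_replacing([part_1, part_2])
```

The catch in the real engine is that merges happen in the background at an unspecified time, so duplicates can be visible until then unless the query asks for the merged view (for example with `FINAL`).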
If you can, share your experience!
I also do finance, so if there is anything more to explore, I’m here to listen and learn.
ClickHouse has a unique performance advantage when your system isn’t normalized operational data that gets updated often, but rather tables of time-series data being ingested write-only.
For example, stock prices or order books in real time: tens of thousands of records per second. ClickHouse can write, merge, and aggregate records really nicely.
Then selects against ordered data with aggregates are lightning fast. It has lots of nuances to learn and has really powerful capability, but only for this type of use case.
It doesn’t have atomic transactions. Updates and deletes perform very poorly.
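A toy illustration of why aggregates over ordered data can be so fast: if rows are kept sorted by time, a range query can binary-search to its boundaries and only touch the matching slice. This is plain Python with made-up ticks, a sketch of the general idea rather than how ClickHouse's storage engine is actually implemented.

```python
from bisect import bisect_left, bisect_right

# Hypothetical ticks, already sorted by timestamp (seconds), as (ts, price).
ticks = [(1, 100.0), (2, 101.0), (4, 99.5), (7, 102.0), (9, 103.5)]
timestamps = [ts for ts, _ in ticks]


def average_price(start, end):
    """Average price for start <= ts <= end, reading only the matching slice."""
    lo = bisect_left(timestamps, start)
    hi = bisect_right(timestamps, end)
    window = ticks[lo:hi]
    if not window:
        return None
    return sum(price for _, price in window) / len(window)


avg = average_price(2, 7)
```

The same query on unsorted data would have to inspect every row, which is roughly the situation a row store is in without a suitable index.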
For high ingestion (really high) you have to start sharding. It’s nice to have a DB that can do that natively, MongoDB and Influx are very popular, depending on the exact application.
After having suffered with T-SQL at MSFT for a number of years… yep, Postgres is almost always the best for almost any enterprise setup, despite what most other corpos seem to think.
Usually their reasons for not using it boil down to:
We would rather pay exorbitant licensing fees of some kind, forever, than rework a few APIs.
Those few APIs already having a fully compatible rewrite, done by me, working in test, prior to that meeting.
Gotta love corpo logic.
Yes, had those issues as well, though lately not a big corp, but mid-sized company.
One manager just wanted MySQL. We had trouble getting required performance from MySQL, when Postgres had good numbers. I had the app fully ready, just to be told no, you make it work in MySQL. So we dropped some ‘useless stuff’ like deferring flushing to disk and such.
I have a colleague like that too, and then the other camp that loves MySQL.
Why do you like Postgres?
I usually tell people running MySQL that they would probably be better off using a NoSQL key-value store, SQLite, or PostgreSQL, in that order. Most people using MySQL don’t actually need an RDBMS. MySQL occupies this weird niche of being optimised for mostly reads, not a lot of concurrency and cosplaying as a proper database while being incompatible with SQL standards.
> incompatible with SQL standards.
Wait… Wait a minute, is that Oracle’s entrance music‽
I made several lengthy presentations about many features, mainly those that are/were missing in MySQL.
In short, MySQL (has been) shit since its inception, with insane defaults and lacking SQL support.
After Oracle bought it, it got better, but it’s catching up with stuff that Postgres has had for 20+ years (in some cases).
Also, fuck Oracle, it’s a shit company.
Edit: if I had to pick the best features I can’t live without, it would be ‘returning’, copy mode and arrays
Oracle:
Only the best in B2B marketing for our shit software.
EDIT:
hah ok, round two, more directly playing on the actual company name:
Oracle:
We tell you what you think you want to hear.
I have to admit though, I’ve never admined the Oracle DB, but I did talk a lot with people who did.
I remember, over 10 years ago, discussing transactional DDL, since I’d heard Oracle does it too, only to get a 5-minute lecture about how it’s nowhere near as simple.
As a complete newb to Postgres, I LOVE arrays.
Postgres feels like all of the benefits of a database and a document store.
Yeah, that was the goal.
First make it a feature-complete document-oriented database, then make it performant.
And you can feel the benefits in every step of the way. Things just work, features actually complement each other… and there’s always a way to make any crazy idea stick.
Things happen magically with docker. Container needs PostgreSQL? Expose the port, define a volume, username and password, connect service to that port, forget PostgreSQL’s existence until data corruption.
Not data corruption, but I overwrote by mistake my .env file for authentik, which contained the password for the postgresql database…
Cue a couple of existential crises over not having set up backups, thinking about nuking the whole installation, learning about postgresql, and finally managing to manually set another password.
Yeah, I feel several years older now…
> forget PostgreSQL’s existence until data corruption.
Oh, so about 2 hours then LMAO
…ok, I’m morbidly curious. How did you manage to do that?
first thing i’d ask it is how to pronounce SQL
Sequel with external collaborators.
Squeal with the homies.
15 years ago I called it S-Q-L and then I was told that it’s wrong and it’s “Sequel”, and they kept calling it Sequel in college so for the past 10 years I’ve called it Sequel, My-Sequel, Sequel-lite, Postgres, transact-sequel, etc. Now y’all are telling me it’s not Sequel
Yup, and it’s S-Q-L not sequel (🤢)
squeal gang rise up!
?
I’ve been working on and with sql dbs since… 2011?
Earlier than that if you don’t count professional work.
Always pronounced it Sequel, as has everyone I have worked with, at least of those who actually have some kind of software dev related role.
It’s got two syllables.
Quicker and easier to say than three syllables.
It isn’t pronounceable as a word; it is an initialism, because the letters that make it up do not allow it to be pronounced as a word. Unlike something like NASA, which is a full-blown acronym because it can be pronounced.
Do you say hetips for HTTPS?
The sequel thing didn’t even start naturally, it picked up this sequel moniker because of some ancient trademark beef in the 70s between the original devs when it was named “Sequel” and some company (That isn’t even in business anymore)
They renamed it SQL and out of protest against the company people continued to call it sequel even though it makes no sense and 50 damn years later here we are. Everybody directly involved is probably dead or longggg since retired. It wasn’t termed because it was easier to say and it sure as hell wasn’t termed because it’s proper.
If it was originally called SQL and the above never happened, I guarantee it would just be another DNS or HTTP and many many pointless debates about it would have never happened
Disclaimer, this doesn’t apply to the MS product that is called sequel
> Do you say hetips for HTTPS?
No, because there isn’t an easily pronounceable equivalent word that already exists in English.
> The sequel thing didn’t even start naturally, it picked up this sequel moniker because of some ancient trademark beef in the 70s between the original devs when it was named “Sequel” and some company (That isn’t even in business anymore)
>
> They renamed it SQL and out of protest against the company people continued to call it sequel even though it makes no sense and 50 damn years later here we are.
Yep, and I’ve worked with a bunch of old timers who were around when that happened, and picked up their pronunciation.
> If it was originally called SQL and the above never happened, I guarantee it would just be another DNS or HTTP and many many pointless debates about it would have never happened.
I mean, I am not … debating in the sense of ‘my way is objectively correct and everyone should say it this way’.
Obviously I know what anyone means if they say S Q L and this does not bother me, I just am used to more commonly saying it as Sequel.
Though it is worth mentioning that… history did in fact happen, the original name was SEQUEL, for Structured English QUEry Language, and it had to be changed because a small aircraft company happened to already own the trademark for ‘SEQUEL’.
> Disclaimer, this doesn’t apply to the MS product that is called sequel.
Ah, well perhaps that explains why I am so used to the Sequel pronunciation;
I used to work for MSFT, and a number of other Seattle area companies with significant SQL database backends…
… and, given that Seattle was mainly known for Boeing before it was mainly known for Microsoft and Amazon and Starbucks, it does make sense that Seattle area old timers would get pissed over a rival, foreign aircraft company (Hawker Siddeley, later merged into BAE) forcing a name change of the software they routinely use.
…
EDIT: So basically, it actually was originally an acronym, and then got forced to become an initialism, and most of the people I’ve worked with and learned from remember when it just was a pronounceable acronym.
I just view it as a pointless 50 year old squabble, its current name is SQL to me so S-Q-L it is, ig I didn’t have “old timers” to corrupt me during the formative years of my career though so maybe that’s why.
Regardless, my real beef in this is if someone makes hiring/firing/promotion decisions based on that. Like a fun office debate about it, cool, it’s whatever. Choosing not to hire or promote someone over something so petty is asinine IMO
> I just view it as a pointless 50 year old squabble, its current name is SQL to me so S-Q-L it is, ig I didn’t have “old timers” to corrupt me during the formative years of my career though so maybe that’s why.
I mean hey, there ya go!
> Regardless, my real beef in this is if someone makes hiring/firing/promotion decisions based on that. Like a fun office debate about it, cool, it’s whatever. Choosing not to hire or promote someone over something so petty is asinine IMO
Oh I completely, 1 million percent agree, and that kind of bullshit was a huge factor in why I left MSFT and went to work for other places, rofl!
Way, waaaaay too many coked up MBAs with tiny small dick syndrome, who compensate by developing a god complex and constantly shaming people over not knowing all the latest buzzwords, which they often themselves just literally heard for the first time in their previous meeting.
That’s easy, but PostgreSQL is pronounced Postgres-Q-L.
No need. It’s the best DB… Until you need something portable
My laptop runs postgres, but it is still pretty portable
I’m running it on a raspberry pi, how much more portable could you need?
What do you mean by “portable”?
Just if you need to be able to take it with you.
The whole point of a database is that you leave it where it is though
I think the OP is trying to talk about SQLite, so yeah, he could really be talking about carrying it on his phone.
But it’s just such a weird word to use there that I can’t really be sure.
Or portable like on a USB stick that you can put in any computer instead of installed on a single system.
Either way, the funny thing is that Postgres can do both too. You may not want to use it for those, but you absolutely can.
Ohhhh right, that’s the base part right?