Fun with Hy and Pandas

houseofleft@slrpnk.net · 17 days ago

Man, I sure wish cybertrucks had been around to deflect when I spent 7 years driving a Fiat Panda.

houseofleft@slrpnk.net · 1 month ago

Oh boy, have fun! CTEs have pretty wide support, so you might be in luck (well, at least in that respect; in all other respects you’re still using Salesforce and my commiserations are with you).

houseofleft@slrpnk.net · edit-2 1 month ago

I have advice that you didn’t ask for at all!

SQL’s declarative ordering annoys me too. In most languages you order things based on when you want them to happen. SQL doesn’t work like that: you need to order query syntax based on where that bit goes according to the rules of SQL. It’s meant to aid readability, and some people like it a lot, but for me it’s just a bunch of extra rules to remember.

Anyway, for nested expressions, I think CTEs make stuff a lot easier, and SQL query optimisers mean you probably shouldn’t have to worry about performance.

I.e. instead of:

SELECT
  one.col_a,
  two.col_b
FROM one
LEFT JOIN
    (SELECT * FROM somewhere WHERE something) as two
    ON one.x = two.x

you can do this:

WITH two as (
     SELECT * FROM somewhere
     WHERE something
)

SELECT
  one.col_a,
  two.col_b
FROM one
LEFT JOIN two
ON one.x = two.x

Especially when things are a little gnarly with lots of nested CTEs, this style makes stuff a tonne easier to reason about.
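If you want to convince yourself the two styles are equivalent, here's a minimal sketch using Python's built-in sqlite3 module. The table contents and the `keep = 1` filter are made up for illustration (they stand in for "somewhere" and "something" above), and it only shows that both queries return the same rows in SQLite, not anything about performance on your database.

```python
import sqlite3

# Hypothetical tables and data, just enough to run both query styles.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE one (x INTEGER, col_a TEXT);
    CREATE TABLE somewhere (x INTEGER, col_b TEXT, keep INTEGER);
    INSERT INTO one VALUES (1, 'a1'), (2, 'a2'), (3, 'a3');
    INSERT INTO somewhere VALUES (1, 'b1', 1), (2, 'b2', 0);
""")

# Nested style: the filtered table is defined inline, inside the join.
subquery_sql = """
    SELECT one.col_a, two.col_b
    FROM one
    LEFT JOIN (SELECT * FROM somewhere WHERE keep = 1) AS two
        ON one.x = two.x
    ORDER BY one.x
"""

# CTE style: the filtered table is named up front, then joined like any other table.
cte_sql = """
    WITH two AS (
        SELECT * FROM somewhere WHERE keep = 1
    )
    SELECT one.col_a, two.col_b
    FROM one
    LEFT JOIN two ON one.x = two.x
    ORDER BY one.x
"""

subquery_rows = conn.execute(subquery_sql).fetchall()
cte_rows = conn.execute(cte_sql).fetchall()
print(subquery_rows == cte_rows)
```

The nice part of the CTE version is that each step gets a name, so a long query reads top to bottom instead of inside out.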

houseofleft@slrpnk.net · 1 month ago

I find meat eaters ask me “would you eat grown meat?” a lot, but my response is always just “I guess maybe? I honestly don’t miss meat that much”. I haven’t come across any vegetarians/vegans who are particularly psyched about it either.

This is all speculation, but I’m not particularly convinced there’s much of a market for lab-grown meat over soy-based products, given how much more expensive it needs to be.

houseofleft@slrpnk.net · 1 month ago

AI: “Have you tried funding public transport and regulating the carbon industry?”

Ok, now we need to make a new AI so that AI can solve global warming but without using an existing solution that might marginally inconvenience the mega rich.

houseofleft@slrpnk.net · 1 month ago

Fun with Hy and Pandas

houseofleft@slrpnk.net · 2 months ago

Yazi sounds ideal! Does river involve as much setup as dwm? I really love the ideas behind suckless tools but they normally involve a lot of setup to configure how I like.

houseofleft@slrpnk.net · 2 months ago

I read ‘Computer Science Distilled’ early on and it really helped me. It’s a very shallow summary of some CS fundamentals, but that’s kind of what you want when you’re starting out: just enough knowledge to know what exists to learn later.

Here’s a link: https://www.goodreads.com/book/show/34189798-computer-science-distilled

houseofleft@slrpnk.net · 2 months ago

Never heard of river but it looks really cool! Come to think of it, I haven’t heard of a bunch of this stuff. Yazi looks really neat.

houseofleft@slrpnk.net · edit-2 2 months ago

I feel like in a lot of ways, most languages are great candidates for this, for lots of different reasons!

Rust: Great choice because it produces a small, very well optimised binary. If you just care about the output binary being small and non-memory intensive, then this is probably a good call.

Buuuuut, Rust’s compilation can be pretty resource intensive, so if you’re actually developing on limited hardware:

C (or curveball option, Hare): produces a small, well optimised binary, with faster compilation. But there are fewer framework-type things to help you on your way to APIs/servers/etc.

Then there’s the fact that it’s a home server, so always on, meaning you actually have generous resources in some ways, because any available CPU is kinda just there to use so:

Python: has a runtime and can be pretty heavy CPU wise, but lots of frameworks, and in all honesty, would wind up being a lot faster to put stuff together in than Rust or C. Probably a great default option until you hit resource issues.

And then why not go whole hog into the world of experimental languages:

Roc: Doesn’t have versions yet, so super new, but should produce a pretty small binary and give you higher level ergonomics than something like Rust or C, especially if you’re into FP.

And then we’re forgetting about:

Haskell: Haskell is the only true programming language, and any time there’s a selection of programming languages, picking the one that isn’t Haskell is the wrong choice. Just ask anyone who programs in Haskell.

But that doesn’t factor in:

JavaScript: Sooner or later, everything is just JavaScript anyway, why even try to resist?

Plus:

Assembly: Can you even trust that it’s well optimised unless you’re writing the assembly yourself?

Edit: My actual serious answer is that Rust + Rocket would be great fun if you’re interested in learning something new, and you’d get very optimised code. If you just want it to use less memory than Java and don’t want to spend too much time learning new things, then Python is probably fine and very quick to learn. Go is a nice halfway point.

houseofleft@slrpnk.net · 2 months ago

I’m a data engineer, use parquet all the time and absolutely love love love it as a format!

Arrow (a data format) + Parquet is particularly powerful, and lets you:

Only read the columns you need (with a CSV your computer has to parse all the data even if afterwards you discard all but one column).
Use metadata to only read relevant files. This is particularly cool and probably needs some unpacking. Say you’re reading 10 files, but only want data where “column-a” is greater than 5. A Parquet reader can look at file metadata at run time (the min/max statistics stored in each file’s footer) and figure out if a file doesn’t have any column-a values over five. And therefore, never have to read it!
Have data in an unambiguous format that can be read by multiple programming languages. Since CSV is text, anything reading it will look at a value like “2022-04-05” and say “oh, this text looks like dates, let’s see what happens if I read it as dates”. Parquet contains actual data type information, so it will always be read consistently.

If you’re handling a lot of data, this kind of stuff can wind up making a huge difference.
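To unpack the metadata trick a bit, here's a toy sketch in plain Python. It is not how Parquet is actually implemented and doesn't use any Parquet library: the "files" are just dicts, and the `min_a`/`max_a` fields are hypothetical stand-ins for the per-column statistics a real reader would consult. The point is only the skipping logic.

```python
# Toy model: each "file" carries min/max metadata for column-a next to its values.
# Names and numbers are made up for illustration.
files = [
    {"name": "part-0", "min_a": 1, "max_a": 4, "values": [1, 3, 4]},
    {"name": "part-1", "min_a": 2, "max_a": 9, "values": [2, 6, 9]},
    {"name": "part-2", "min_a": 0, "max_a": 5, "values": [0, 5]},
]


def read_greater_than(files, threshold):
    """Return (names of files actually opened, values greater than threshold)."""
    opened = []
    matches = []
    for f in files:
        # If the largest value in the file is not above the threshold,
        # nothing in it can match, so the file is skipped without reading its values.
        if f["max_a"] <= threshold:
            continue
        opened.append(f["name"])
        matches.extend(v for v in f["values"] if v > threshold)
    return opened, matches


opened, matches = read_greater_than(files, 5)
print(opened, matches)
```

With real data the saving comes from never touching the bytes of the skipped files, which matters a lot more when each file is hundreds of megabytes rather than three numbers.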

houseofleft@slrpnk.net · edit-2 3 months ago

I’m a data engineer, and have seen an ungodly amount of 200-but-actually-no-stuff-is-broken errors and it’s the bane of my life!

We have generic code to handle pulling in API data and transforming it. It obviously checks the status code, but any time an API implements this we have to choose between:

having code fail weirdly further down the line because it can’t parse the status
adding in some kind of insane if not response.ok or "actually no there's an error really" in response.content logic

Every time you ignore protocols and invent your own, you are making everyone sad.
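For what the second option tends to look like in practice, here's a rough sketch. The `ApiError` class, the `check_response` helper, and the idea that the error lives under an "error" key in the JSON body are all assumptions for illustration; every misbehaving API hides its errors somewhere different, which is exactly the problem.

```python
class ApiError(Exception):
    """Raised when a response is an error, whatever the status code claims."""


def check_response(status_code, body):
    """Return body if the response is genuinely OK, otherwise raise ApiError."""
    # The normal, protocol-respecting check.
    if not 200 <= status_code < 300:
        raise ApiError(f"HTTP {status_code}")
    # The special case: a 200 whose body says it failed.
    # Assumes (hypothetically) that the API reports this under an "error" key.
    if isinstance(body, dict) and body.get("error"):
        raise ApiError(f"HTTP {status_code} but body reports: {body['error']}")
    return body


print(check_response(200, {"data": [1, 2, 3]}))
```

The second `if` is the part that has to be rewritten for every API that invents its own error convention, so generic code stops being generic.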

Will take recommendations of support groups I can join for victims of terrible apis.

houseofleft@slrpnk.net · 3 months ago

Take a look at RetroPie, which is more or less what you’re talking about!

Depending on what you want to get out of the project:

You might be happy just using RetroPie
You might be happy working on top of RetroPie
You might want to build something from scratch and just use RetroPie as a reference

Anyway, I’ll stop being a shill now and just give you the link: https://retropie.org.uk/

houseofleft@slrpnk.net · 3 months ago

I like this a lot. This reminds me a lot of KQL (a Microsoft query language that’s used for a bunch of Azure logging).

I use a lot of Python pandas/dask, and I’ve definitely got used to viewing a table as a series of operations to perform rather than the kind of declarative queries you get in SQL.

At what point is it no longer SQL? If we’re changing fundamental stuff, I’d love a way of writing loops or if statements that isn’t painful too.

houseofleft@slrpnk.net · 3 months ago

Ah, Marginalia is absolutely awesome! I feel like modern search is almost an extension of website names now, so if I want to find Netflix but don’t know its website, I might search for “netflix”. Marginalia is actually a cool way to find new stuff, like you can search “bike maintenance” and find cool blog posts about that topic.

I honestly can’t remember if that’s something google and the like used to do, but doesn’t now, or if they never did. Either way, I love it!