Cryptography nerd

  • 0 Posts
  • 380 Comments
Joined 11 months ago
Cake day: August 16th, 2023

  • Humans learn a lot through repetition, so there’s no reason to believe that LLMs wouldn’t benefit from reinforcement of higher quality information. Seeing the same information in different contexts especially helps map the links between those contexts and dispel incorrect assumptions. But like I said, the only viable method they have for this kind of emphasis at scale is incidental replication of more popular works in the samples, and when something is duplicated too much the model overfits instead.

    To fix this conflict they need to fundamentally change big parts of how learning happens and how the algorithm learns. In particular it will need a lot more “introspective” training stages to refine what it has learned, and pretty much nobody does anything even slightly similar on large models, because nobody knows how, and it would be insanely expensive anyway.


  • Yes, but should big companies with business models designed to be exploitative be allowed to act hypocritically?

    My problem isn’t with ML as such, or with learning over such large sets of works, but that these companies are designing their services specifically to push the people whose works they rely on out of work.

    The irony of overfitting is that having numerous copies of common works is a problem AND removing the duplicates would be a problem. The model needs an understanding of what’s representative of language, but the training algorithms can’t learn that on their own, it’s not feasible to have humans teach it, and the training algorithm also can’t effectively detect duplicates and “tune down” their influence to stop replicating them exactly. Trying to do that latter thing algorithmically will ALSO break things, as it would break the model’s understanding of stuff like standard legalese and boilerplate language (see the sketch below).
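    To make the “tune down their influence” idea concrete, here’s a toy Python sketch of down-weighting exact duplicates by hashing. The hard part, as described above, is that real near-duplicates (and legitimate boilerplate) don’t hash identically, so this naive version crushes exactly the repetition the model needs:

    ```python
    # Toy sketch: give a sample seen n times weight 1/n, so its n copies
    # together count once during training. Real corpora need near-duplicate
    # detection (MinHash, embeddings), which is where this gets hard.
    from collections import Counter
    from hashlib import sha256

    def sample_weights(samples: list[str]) -> list[float]:
        counts = Counter(sha256(s.encode()).hexdigest() for s in samples)
        return [1.0 / counts[sha256(s.encode()).hexdigest()] for s in samples]

    print(sample_weights(["popular work", "popular work", "rare work"]))
    # [0.5, 0.5, 1.0] - note that legalese and boilerplate get crushed too
    ```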

    The current generation of generative ML doesn’t do what it says on the box, AND the companies running them deserve to get screwed over.

    And yes, I understand the risk of screwing up fair use, which is why my suggestion is not to hinder learning, but to require the companies to track the copyright status of samples and inform end users of the licensing status when the system detects that a sample is substantially replicated in the output (a sketch of that check follows). This will not hurt anybody training on public domain or fairly licensed works, nor hurt anybody who tracks authorship when crawling for samples, and it will also not hurt anybody who has designed their ML system to be sufficiently transformative that it never replicates copyrighted samples. It just hurts exploitative companies.
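    A minimal sketch of that output-side check, assuming an n-gram index over tracked samples (the corpus layout, n, and threshold are all illustrative):

    ```python
    # Toy sketch: flag any tracked sample whose word n-grams overlap the
    # model output beyond a threshold, and report its license to the user.
    def ngrams(text: str, n: int = 8) -> set[tuple[str, ...]]:
        words = text.split()
        return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}

    def flag_licenses(output: str, corpus: dict[str, str],
                      threshold: float = 0.5) -> dict[str, str]:
        """corpus maps sample text -> license; returns flagged samples."""
        out = ngrams(output)
        flagged = {}
        for sample, license_ in corpus.items():
            grams = ngrams(sample)
            if grams and len(out & grams) / len(grams) >= threshold:
                flagged[sample[:30]] = license_  # truncated key for display
        return flagged
    ```

    A real system would need an inverted index rather than a linear scan, and fuzzier matching than exact n-grams, but the reporting requirement itself doesn’t demand anything more exotic than this.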


  • I like that the Viture has dimming, lens adjustment, and an optional Android based neckband device, and the Miracast support is neat.

    If all you want is a compact screen it’s pretty good, and I’m considering getting one, but I want to see some more stuff like integration with your other devices. I see they have remote desktop stuff for gaming, but I’m thinking of a bit deeper integration, like using a phone app to relay notifications like a HUD, and I want a bit more spatial awareness (that might need to rely on stuff like radio beacons, e.g. UWB). Navigation also seems to rely on either your phone, buttons on the neckband, or a paired 3rd party controller (there’s no official wireless controller); you could make it a bit easier with something that’s maybe keychain sized.

    Imagine if the headset could piggyback on your phone’s AR support plus UWB direction finding: the phone calculates where it is relative to the world and relays that to the headset, which calculates its own offset to tell where IT is in the world (a rough sketch below). That would immediately make Google Maps Live View infinitely more immersive (overlays don’t need to be perfect, they just need to not drift by too many degrees). It would probably be annoying to have to keep scanning with your phone to keep the map accurate though 🤷
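    The phone-plus-offset idea is just composing rigid transforms; a minimal sketch with numpy 4x4 matrices (all names and values are illustrative):

    ```python
    # Sketch: the phone knows its pose in the world (from AR tracking),
    # the headset knows its pose relative to the phone (e.g. via UWB),
    # so the headset's world pose is one matrix multiplication away.
    import numpy as np

    def translation(x: float, y: float, z: float) -> np.ndarray:
        m = np.eye(4)
        m[:3, 3] = [x, y, z]
        return m

    world_from_phone = translation(2.0, 0.0, 5.0)    # from phone AR tracking
    phone_from_headset = translation(0.0, 0.3, 0.1)  # from UWB ranging
    world_from_headset = world_from_phone @ phone_from_headset
    print(world_from_headset[:3, 3])  # headset position in world coordinates
    ```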



  • The point of such an early dev kit isn’t to commit in advance but to get people to try out what works, then select what will be in the final product (maybe releasing updated dev kits along the way). There would be a general plan, but this isn’t like a game console dev kit where almost all specs and major features are set in advance, so you’d expect devs to implement multiple variants of each software feature and see what they require of the hardware, how people use them, how popular they are, etc.


  • Exactly. Not promoting it as a dev kit was a major failure. This is the kind of product where you CAN’T do without external feedback: not everybody will use one in a clean office (or even one that stands still), not everybody has the same spatial awareness or motor skills, not supporting controllers locks out numerous people with limited hand movement, etc. As a dev kit it could’ve worked much better at getting the kind of feedback they need from devs working on useful AR stuff.


  • “Neither of these mention networks, only protocols/schemes, which are concepts. Cryptography exists outside networks, and outside computer science (even if that is where it finds the most use).”

    This is ridiculous rules lawyering and isn’t even done well. Such schemes inherently assume multiple communicating parties. Sure, you might not need a network, but you still have to have distinct devices and a communication link of some sort (because if you had a direct trusted channel you wouldn’t need cryptography).

    You’re also wrong about your interpretation.

    Here’s how to read it:

    At point A both parties create their long term identity keys.

    At point B they initiate a connection, and create session encryption keys with a key exchange algorithm (first half of PFS)

    At point C they exchange information over the encrypted channel.

    At point D the session keys are automatically deleted (second half of PFS)

    At point E the long term key of one party is leaked. The contents from B and C cannot be recovered, because the session key is independent of the long term key and is now deleted. This is forward secrecy. The adversary can’t compromise it after the fact without breaking the whole algorithm; they have to attack the clients while the session is ongoing.
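    A minimal sketch of points A through D using the Python cryptography package (Ed25519 for the long term identity keys, X25519 for the ephemeral exchange; both parties are simulated in one script and all names are illustrative):

    ```python
    # Forward secrecy sketch: the session key comes only from ephemeral
    # X25519 keys, never from the long term Ed25519 identity keys.
    from cryptography.hazmat.primitives import hashes
    from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey
    from cryptography.hazmat.primitives.asymmetric.x25519 import X25519PrivateKey
    from cryptography.hazmat.primitives.kdf.hkdf import HKDF

    # Point A: both parties create long term identity keys (kept for years).
    alice_identity = Ed25519PrivateKey.generate()
    bob_identity = Ed25519PrivateKey.generate()

    # Point B: per session, each side generates a fresh ephemeral key pair
    # and signs its public half with the identity key (authentication).
    alice_eph = X25519PrivateKey.generate()
    bob_eph = X25519PrivateKey.generate()
    alice_sig = alice_identity.sign(alice_eph.public_key().public_bytes_raw())
    alice_identity.public_key().verify(   # Bob checks Alice's signature
        alice_sig, alice_eph.public_key().public_bytes_raw())

    # Both sides derive the same session key from the ephemeral exchange;
    # the long term keys never enter the derivation (first half of PFS).
    shared = alice_eph.exchange(bob_eph.public_key())
    session_key = HKDF(algorithm=hashes.SHA256(), length=32,
                       salt=None, info=b"session").derive(shared)

    # Point C: encrypt the traffic with session_key (e.g. AES-GCM).

    # Point D: delete the ephemeral secrets when the session ends (second
    # half of PFS). Leaking an identity key later (point E) recovers
    # nothing, because session_key was never derivable from it.
    del alice_eph, bob_eph, shared, session_key
    ```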

    This is motivated, for example, by how SSL 3.0 was usually used with a single fixed RSA keypair per server, letting user clients generate and submit the session encryption keys - allowing a total break of all communications with the server if that key is compromised. Long term DH secrets were also often reused when they should have been single use. Then we moved on to ECDH, where generating new session secrets is fast, and everybody adopted real PFS.

    Yes, compromising the key often means you get stuff like the database too. Not the point! If you keep deleting sensitive data locally when you should, then PFS guarantees it’s actually gone; the NSA can’t store the traffic in their big data warehouse and hope to steal the key later to decrypt what you thought you deleted. It’s actually gone.

    And both of the definitions you quoted above mean the same as this.

    “In any case, both of these scenarios create an attack vector through which an adversary can get all of your old messages, which, whether you believe violates PFS by your chosen definition or not, does defeat its purpose (perhaps you prefer this phrasing to ‘break’ or ‘breach’).”

    Playing loose with definitions is how half of all broken cryptographic schemes ended up insecure in the first place. Being precise with attack definitions allows for better analysis and better defenses.

    Like how better analysis of common attacks on long running chats with PFS led to “self healing” properties being developed, countering point-in-time leaks of session keys by repeatedly performing key exchanges, and to long term keys being better protected, for example by making sure software like Signal uses the OS provided hardware backed keystore for them. All of this is modeled carefully and described with precise terms.

    Edit: given modern sandbox techniques in phones, most malware and exploits don’t survive a reboot. If malware can compromise your phone at a specific time but can’t break the TPM, then once you reboot and your app rekeys, the adversary no longer has access, and this can be demonstrated with mathematical proofs. That’s self healing PFS (sketched below).
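    A rough sketch of that self healing idea with the same Python cryptography package; the single root key and the lockstep rekeying are simplified stand-ins for what Signal’s Double Ratchet actually does:

    ```python
    # Simplified "self healing" ratchet: every round mixes a fresh DH
    # exchange into the running key, so a key stolen at time T stops
    # being useful once both sides have ratcheted past T.
    from cryptography.hazmat.primitives import hashes
    from cryptography.hazmat.primitives.asymmetric.x25519 import X25519PrivateKey
    from cryptography.hazmat.primitives.kdf.hkdf import HKDF

    def ratchet_step(root_key: bytes, my_priv, their_pub) -> bytes:
        """Mix a fresh DH output into the running root key."""
        fresh = my_priv.exchange(their_pub)
        return HKDF(algorithm=hashes.SHA256(), length=32,
                    salt=root_key, info=b"ratchet").derive(fresh)

    root = b"\x00" * 32  # stand-in for the initial session key
    for _ in range(3):   # e.g. rekey on every reply, or after each reboot
        a_priv = X25519PrivateKey.generate()  # both parties simulated here
        b_priv = X25519PrivateKey.generate()
        root = ratchet_step(root, a_priv, b_priv.public_key())
        # An adversary who stole an older root (or old ephemeral keys)
        # can't compute this new root without the fresh DH secrets.
    ```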

    “Anyone can start a forum.”

    Fair point, but my cryptography forum (reddit.com/r/crypto) has regulars that include people writing the TLS specifications and other well known experts. They’re hanging around because the forum is high quality, and I’m able to keep quality high because I can tell who’s talking bullshit and who knows their stuff.