hedgehog

joined 2 years ago
[–] hedgehog@ttrpg.network 1 points 3 hours ago* (last edited 2 hours ago)

I think the best way to handle this would be to just encode everything and upload all files. If I wanted some amount of history, I'd use some file system with automatic snapshots, like ZFS.

If I wanted to do what you've outlined, I would probably use rclone with filtering for the extension types or something along those lines.

If I wanted to do this with Git specifically, though, this is what I would try first:

First, add lossless extensions (*.flac, *.wav) to my repo's .gitignore

Second, schedule a job on my local machine that:

  1. Watches for changes to the local file system (e.g., with inotifywait or fswatch)
  2. For any new lossless file, if there isn't already an accompanying lossy file (i.e., one that's collocated and has the exact same filename, sans extension, with an accepted extension, e.g., .mp3 or .ogg - possibly also with a confirmation that the codec is up to my standards via a call to ffprobe, avprobe, mediainfo, exiftool, or something similar), encodes the file to my preferred lossy format (a rough sketch of steps 1-4 follows this list).
  3. Use git status --porcelain to check whether there have been any changes.
  4. If so, run git add --all && git commit --message "Automatic commit" && git push
  5. Optionally, automatically craft a better commit message by checking which files have been changed, generating text like Added album: "Satin Panthers - EP" by Hudson Mohawke or Removed album: "Brat" by Charli XCX; Added album "Brat and it's the same but there's three more songs so it's not" by Charli XCX
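
A rough, untested sketch of that job (steps 1-4 - it skips the ffprobe check and the fancier commit messages), assuming inotifywait from inotify-tools, ffmpeg with libvorbis, and a library at ~/Music, all of which are placeholders:

#!/usr/bin/env bash
# Hypothetical watcher/encoder/committer - path, codec, and quality are placeholders.
MUSIC_DIR="$HOME/Music"    # assumed location of the library (also the Git working copy)
cd "$MUSIC_DIR" || exit 1

while true; do
  # 1. Block until something in the library changes (requires inotify-tools).
  inotifywait --recursive --event close_write,create,moved_to,delete .

  # 2. Encode any lossless file that doesn't already have a lossy sibling.
  find . -type f \( -iname '*.flac' -o -iname '*.wav' \) | while read -r src; do
    dst="${src%.*}.ogg"
    [ -e "$dst" ] && continue                                 # lossy copy already exists
    ffmpeg -nostdin -n -i "$src" -c:a libvorbis -q:a 6 "$dst"
  done

  # 3 & 4. Commit and push only if anything actually changed.
  if [ -n "$(git status --porcelain)" ]; then
    git add --all
    git commit --message "Automatic commit"
    git push
  fi
done

I'd run that as a systemd user service (or just from tmux) rather than cron, since inotifywait blocks until something changes.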

Third, schedule a job on my ~~remote machine~~ server that runs git pull at regular intervals.
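
For example, a crontab entry on the server along these lines (the path and the 15-minute interval are placeholders):

*/15 * * * * cd /srv/music && git pull --ff-only --quiet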

One issue with this approach is that if you delete a file (as opposed to moving it), the space is not recovered on your local or your server. If space on your server is a concern, you could work around that by running something like the answer here (adjusting the depth to an appropriate amount for your use case):

git fetch --depth=1                               # shallow-fetch so only the most recent commit is kept locally
git reflog expire --expire-unreachable=now --all  # expire reflog entries that would keep old objects alive
git gc --aggressive --prune=all                   # repack and delete the now-unreachable objects

Another potential issue is that what I described above involves having an intermediary Git repository to push to and pull from, e.g., one hosted on a Git forge like GitHub, Codeberg, etc. That could result in copyright complaints or something along those lines, though.

Alternatively, you could use your server as the Git server (or check out Forgejo if you want a Git forge as well), but then you can't use the above trick to prune file history and save space from deleted files (on the server, at least - you could on your local, I think). If you then check out your working copy in a way such that Git can use hard links, you should at least be able to avoid needing to store two copies on your server.
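
If I'm remembering Git's behavior correctly, a plain clone from a local path on the same filesystem does that hard linking by default (the paths here are hypothetical):

# .git/objects in the new clone is hard-linked to the bare repo's objects,
# so the history isn't stored twice; the checked-out files themselves still take space.
git clone /srv/git/music.git /srv/music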

~~The other thing to check out, if you take this approach, is git lfs.~~ EDIT: Actually, I take that back - you probably don't want to use Git LFS.

[–] hedgehog@ttrpg.network 2 points 3 days ago (1 children)

Are you talking about a warning for a self signed cert or for not using HTTPS?

[–] hedgehog@ttrpg.network 3 points 3 days ago

It was already known before the whistleblower that:

  1. Siri inputs (all STT at that time, really) were processed off device
  2. Siri had false activations

The “sinister” thing that we learned was that Apple was reviewing those activations to see if they were false, with the stated intent (as confirmed by the whistleblower) of using them to reduce false activations.

There are also black box methods to verify that data isn’t being sent and that particular hardware (like the microphone) isn’t being used, and there are people who look for vulnerabilities as a hobby. If the microphones on the most/second most popular phone brand (iPhone, Samsung) were secretly recording all the time, evidence of that would be easy to find and would be a huge scoop - why haven’t we heard about it yet?

Snowden and Wikileaks dumped a huge amount of info about governments spying, but nothing in there involved always on microphones in our cell phones.

To be fair, an individual phone is a single compromise away from actually listening to you, so it still makes sense to avoid having sensitive conversations within earshot of a wirelessly connected microphone. But generally that’s not the concern most people should have.

Advertising tracking is much more sinister and complicated and harder to wrap your head around than “my phone is listening to me” and as a result makes for a much less glamorous story, but there are dozens, if not hundreds or thousands, of stories out there about how invasive advertising companies’ methods are, about how they know too much, etc. Think about what LLMs do with text. The level of prediction that they can do. That’s what ML algorithms can do with your behavior.

If you’re misattributing what advertisers know about you to the phone listening and reporting back, then you’re not paying attention to what they’re actually doing.

So yes - be vigilant. Just be vigilant about the right thing.

[–] hedgehog@ttrpg.network 5 points 4 days ago (2 children)

proven by a whistleblower from apple

Assuming you have an iPhone. And even then, the whistleblower you’re referencing was part of a team who reviewed utterances by users with the “Hey Siri” wake word feature enabled. If you had Siri disabled entirely or had the wake word feature disabled, you weren’t impacted at all.

This may have been limited to impacting only users who also had some option like “Improve Siri and Dictation” enabled, but it’s not clear. Today, the Privacy Policy explicitly says that Apple can have employees review your interactions with Siri and Dictation (my understanding is the reason for the settlement is that they were not explicit that human review was occurring). I strongly recommend disabling that setting, particularly if you have a wake word enabled.

If you have wake words enabled on your phone or device, your phone has to listen to be able to react to them. At that point, of course the phone is listening. Whether it’s sending the info back somewhere is a different story, and there isn’t any evidence that I’m aware of that any major phone company does this.

[–] hedgehog@ttrpg.network 2 points 4 days ago (1 children)

Sure - Wikipedia says it better than I could hope to:

As English-linguist Larry Andrews describes it, descriptive grammar is the linguistic approach which studies what a language is like, as opposed to prescriptive, which declares what a language should be like.[11]: 25  In other words, descriptive grammarians focus analysis on how all kinds of people in all sorts of environments, usually in more casual, everyday settings, communicate, whereas prescriptive grammarians focus on the grammatical rules and structures predetermined by linguistic registers and figures of power. An example that Andrews uses in his book is fewer than vs less than.[11]: 26  A descriptive grammarian would state that both statements are equally valid, as long as the meaning behind the statement can be understood. A prescriptive grammarian would analyze the rules and conventions behind both statements to determine which statement is correct or otherwise preferable. Andrews also believes that, although most linguists would be descriptive grammarians, most public school teachers tend to be prescriptive.[11]: 26

[–] hedgehog@ttrpg.network 4 points 4 days ago (4 children)

You might be interested in reading up on the debate of “Prescriptive vs Descriptive” approaches in a linguistics context.

[–] hedgehog@ttrpg.network 2 points 5 days ago

You should try watching the live action series next - I bet you’d love it.

[–] hedgehog@ttrpg.network 3 points 6 days ago

The one I grabbed to test was the ROG Azoth.

I also checked my Iris and Moonlander - both cap out at 6, but I believe I can update that to be higher with QMK or add a config key via Oryx on the Moonlander to turn it on.
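
If I go the QMK route, I believe (from memory of the QMK docs, so double-check) it's roughly a matter of enabling NKRO in the keymap's build config and forcing it on:

# rules.mk
NKRO_ENABLE = yes

// config.h
#define FORCE_NKRO

There's also an NK_TOGG keycode you can put on a layer to toggle it at runtime, if I recall correctly.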

[–] hedgehog@ttrpg.network 4 points 6 days ago (2 children)

Per this thread from 2009, the limit was conditional upon using a particular keyboard descriptor documented elsewhere in the spec, but keyboards are not required to use that descriptor.

I tested just now on one of my mechanical keyboards, on MacOS, connected via USB C, using the Online Key Rollover Test, and was able to get 44 keys registered at the same time.

[–] hedgehog@ttrpg.network 1 points 1 week ago

From the Slashdot comments, by Rei:

Or, you can, you know, not fall for clickbait. This is one of those...

Ultimately, we found that the common understanding of AI’s energy consumption is full of holes.

"Everyone Else Is Wrong And I Am Right" articles, which starts out with....

The latest reports show that 4.4% of all the energy in the US now goes toward data centers.

without bothering to mention that AI is only a small percentage of data centre power consumption (Bitcoin alone is an order of magnitude higher), and....

In 2017, AI began to change everything. Data centers started getting built with energy-intensive hardware designed for AI, which led them to double their electricity consumption by 2023.

What a retcon. AI was *nothing* until the early 2020s. Yet datacentre power consumption did start skyrocketing in 2017 - having nothing whatsoever to do with AI. Bitcoin was the big driver.

At that point, AI alone could consume as much electricity annually as 22% of all US households.

Let's convert this from meaningless hype numbers to actual numbers. First off, notice the fast one they just pulled - global AI usage to just the US, and just households. US households use about 1500 TWh of the world's 24400 TWh/yr, or about 6%. 22% of 6% is ~1,3% of electricity (330 TWh/yr). Electricity is about 20% of global energy, so in this scenario AI would be 0,3% of global energy. We're just taking at face value their extreme numbers for now (predicting an order of magnitude growth from today's AI consumption), and ignoring that even a single AI application alone could entirely offset the emissions of all AI combined. Let's look first at the premises behind what they're arguing for this 0,3% of global energy usage (oh, I'm sorry, let's revert to scary numbers: "22% OF US HOUSEHOLDS!"):

  • It's almost all inference, so that simplifies everything to usage growth
  • But usage growth is offset by the fact that AI efficiency is simultaneously improving at faster than Moore's Law on three separate axes, which are multiplicative with each other (hardware, inference, and models). You can get what used to take insanely expensive, server-and-power-hungry GPT-4 performance (1,5T parameters) on a model small enough to run on a cell phone that, run on efficient modern servers, finishes its output in a flash. So you have to assume not just one order of magnitude of inference growth (due to more people using AI), but many orders of magnitude of inference growth.
    • You can try to Jevon at least part of that away by assuming that people will always want the latest, greatest, most powerful models for their tasks, rather than putting the efficiency gains toward lower costs. But will they? I mean, to some extent, sure. LRMs deal with a lot more tokens than non-LRMs, AI video is just starting to take off, etc. But at the same time, for example, today LRMs work in token space, but in the future they'll probably just work in latent space, which is vastly more efficient. To be clear, I'm sure Jevon will eat a lot of the gains - but all of them? I'm not so sure about that.
    • You need the hardware to actually consume this power. They're predicting by - three years from now - to have an order of magnitude more hardware out there than all the AI servers combined to this point. Is the production capacity for that huge level of increase in AI silicon actually in the works? I don't see it.

[–] hedgehog@ttrpg.network 2 points 1 week ago

There’s a difference between a tool being available to you and a tool being misused by your students.

That said, I wouldn’t trust AI assessments of students to determine if they’re on track right now, either. Whatever means the AI would use needs to be better than grading quizzes, homework, etc., and while I’m not a teacher, I would be very surprised if it were better than any halfway competent teacher’s assessments (thinking in terms of high school and younger, at least - in university IME the expectation is that you self assess during the term and it’s up to you to seek out learning opportunities outside class if you need them, like going to office hours for your prof or TA).

AI isn’t useless, though! It’s just being used wrong. For example, AI can improve OCR, making it more feasible for students to hand in submissions that can be automatically graded, or to improve accessibility for graders. But for that to actually be helpful we need better options on the hardware front and for better integration of those options into grading systems, like affordable batch scanners that you can just drop a stack of 50 assignments into, each a variable number of pages, with software that will automatically sort out the results by assignment and submitter, and automatically organize them into the same place that you put all the digital submissions.

 

This only applies when the homophone is spoken or part of an audible phrase, so written text is safe.

It doesn’t change reality, just how people interpret something said aloud. You could change “Bare hands” to be interpreted as “Bear hands,” for example, but the person wouldn’t suddenly grow bear hands.

You can only change the meaning of the homophones.

It’s not all or nothing. You can change how a phrase is interpreted for everyone, or:

  • You can affect only a specific instance of a phrase - including all recordings of it, if you want - but you need to hear that instance - or a recording of it - to do so. If you hear it live, you can affect everyone else’s interpretation as it’s spoken.
  • You can choose not to affect how it is perceived by people when they say it aloud, and only when they hear it.
  • You can affect only the perception of particular people for a given phrase, but you must either point at them (pictures work) or be able to refer to them with five or fewer words, at least one of which is a homophone. For example, “my aunt.” Note that if you do this, both interpretations of the homophone are affected, if relevant (e.g., “my ant”).
  • You can make it so there’s a random chance (in 5% intervals, from 5% to 95%) that a phrase is misinterpreted.
 

cross-posted from: https://lemmy.world/post/19716272

Meta fed its AI on almost everything you’ve posted publicly since 2007

 

The video teaser yesterday about this was already DMCAed by Nintendo, so I don’t think this video will be up long.
