I really want podcast apps to hook into SponsorBlock.
Right now I'm listening to very few English podcasts because of all the advertising. I switched to German and Swedish because most of them don't have any advertising in them.
Podcasts often dynamically generate ads at the point of download, making the SponsorBlock approach unviable: since the media is expected to be variable-length, you can't store media positions that map to advertisement segments.
You could potentially match on audio, though -- look for the 15 seconds of podcast audio preceding the ad, and the 15 seconds following it, if folks reported it in a sponsorblock way.
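A rough sketch of that anchor idea, assuming the audio around the ad is byte-identical between downloads. That is a big simplification: real audio would need acoustic fingerprinting that tolerates re-encoding, and plain byte search only stands in for it here. The function name and the `lead_in` / `lead_out` parameters are made up for illustration.

```python
from typing import Optional, Tuple


def locate_ad(audio: bytes, lead_in: bytes, lead_out: bytes) -> Optional[Tuple[int, int]]:
    """Return (ad_start, ad_end) offsets in `audio`, or None if the anchors are missing.

    lead_in  -- reported snippet that plays right before the ad
    lead_out -- reported snippet that plays right after the ad
    Exact byte matching is an assumption; it stands in for fuzzy audio matching.
    """
    before = audio.find(lead_in)
    if before == -1:
        return None
    ad_start = before + len(lead_in)
    # Only look for the closing anchor after the opening one.
    ad_end = audio.find(lead_out, ad_start)
    if ad_end == -1:
        return None
    return ad_start, ad_end


episode = b"...and that's the story.[BUY STUFF NOW]Welcome back, so..."
span = locate_ad(episode, b"that's the story.", b"Welcome back")
if span is not None:
    cleaned = episode[:span[0]] + episode[span[1]:]
```

Because the anchors are content rather than timestamps, the same report works no matter how long the inserted ad turns out to be.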
Alternatively, we could build a Shazam-style database of 30-second podcast ads, then skip them when they're identified. There isn't much variety out there.
Ah interesting, that just shows that I have successfully avoided subscribing to those. The few ones with ads I listen to get the ads read by the hosts like in YouTube videos.
As others have pointed out, this makes automatic detection easier, not harder, if true. Just cut the segments that move around; audio analysis and even transcripts (they don't have to be good! just good enough to identify a missing segment!) are pretty mature.
I listen to several ad-supported English-language podcasts, and most of them seem to have difficulties with Pocket Casts, AntennaPod, or some other part of my setup. The only ones that do get ads placed are the ones that use Spotify Megaphone for their backend.
Podcasts are a leftover from the non-centralized and non-monetized internet of the past. Because of that, most podcasts are still available as RSS feeds, so you should only ever get ads if they are spoken by the podcast's hosts. Are you talking about those? Only something like SponsorBlock would help against those.
I use AntennaPod (F-Droid) on Android to listen to podcasts. Some hosts always start their podcast with an ad, but you can auto-skip the first minute of a certain podcast with AntennaPod every time. It has a setting for that. AntennaPod itself is FOSS software without ads.
Yeah, any decent podcast app that has a 10s / 15s time skip is the only reliable way to deal with ads. Just skip ahead a few times until the ad is over.
If there were a reliable way to auto-skip ads, then ads would lose all their value, which could shut down some of our favorite podcasts. It sucks that ads are a necessary evil for podcasting, but there is no clean way around that unless we dismantled capitalism and switched to some hybrid of market socialism and public funding.
Ingest podcast feeds, crowdsource hashes of whole and partial sections of the downloaded audio, which should be a good start to auto-tag dynamically inserted ads.
For non-dynamic ads, provide an interface to manually identify their start/end, and publish for others. The same interface could be used to add chapters and other metadata.
Then you’d just point your podcast app to an RSS feed you self host.
I propose Listenarr, unless this has already been taken.
Alternatively what you're describing sounds like SponsorBlock but for podcasts. You probably wouldn't have to rehost the actual audio files to accomplish this, just have a podcast client/addon that allows user submissions for ad segments and a database somewhere that can host the metadata for ad breaks.
The biggest issue is that you'd probably be building or forking an existing podcast app to do it, and some podcasts dynamically insert ads, so it's possible that people's downloaded files could have different ad segments/times.
I thought I explained how to handle the dynamically inserted ads, but I’ll elaborate a little here.
If your Listenarr instance is part of a broader network of other instances, they’ll all potentially receive a unique file with different ads inserted, but they’ll typically be inserted at the same cut location in the program timeline. Listenarr would calculate the hash of the entire file, but also sub spans of various lengths.
If the hash of the full file is the same among instances, you know everyone is getting the same file, and any time references suggested for metadata will apply to everyone.
If the full file hash is different, Listenarr starts slicing it up and generating hashes of subsections to help identify where common and variant sections are. Common sections will usually be the actual content, variants are likely tailored ads. The broader the Listenarr network, the greater the sample size for hashes, which will help automate identification. In fact, the more granular and specific the targeting of inserted ads, the easier it will be to identify them.
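Here is a toy sketch of that comparison, under the assumption that the non-ad content is byte-identical across downloads (it may not be if the host re-encodes per request). The function names and the tiny 4-byte window are illustrative only; a real instance would use much larger windows and exchange hashes rather than raw files.

```python
import hashlib
from typing import List, Set, Tuple


def digest(chunk: bytes) -> str:
    return hashlib.sha256(chunk).hexdigest()


def window_hashes(data: bytes, size: int) -> Set[str]:
    """Hash every `size`-byte window of `data`, at every offset."""
    return {digest(data[i:i + size]) for i in range(len(data) - size + 1)}


def shared_spans(reference: bytes, candidate: bytes, size: int = 4) -> List[Tuple[int, int]]:
    """Return (start, end) spans of `candidate` whose windows also occur in `reference`.

    Spans not returned are the variant sections, i.e. likely inserted ads.
    """
    known = window_hashes(reference, size)
    spans: List[Tuple[int, int]] = []
    pos = 0
    while pos + size <= len(candidate):
        if digest(candidate[pos:pos + size]) in known:
            if spans and spans[-1][1] == pos:
                spans[-1] = (spans[-1][0], pos + size)  # extend the current span
            else:
                spans.append((pos, pos + size))
            pos += size
        else:
            pos += 1  # slide by one byte so ads of any length stay aligned
    return spans


# Two downloads of the "same" episode with different ads at the same cut point.
copy_a = b"AAAABBBB" + b"ad1" + b"CCCCDDDD"
copy_b = b"AAAABBBB" + b"advert2" + b"CCCCDDDD"

same_file = digest(copy_a) == digest(copy_b)  # full-file check comes first
common = shared_spans(copy_a, copy_b)
```

Sliding one byte at a time on a mismatch is what lets the content after the ad line up again even when the two ads differ in length.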
Once you have the file sections sufficiently hashed, tagged, and identified, you can easily stitch together a sanitised media stream into a file any podcast app can ingest.
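The stitching step mostly comes down to turning the tagged ad segments into the list of ranges you want to keep. A minimal sketch, with a hypothetical `keep_ranges` helper and times in seconds:

```python
from typing import Iterable, List, Tuple

Span = Tuple[float, float]


def keep_ranges(duration: float, ad_segments: Iterable[Span]) -> List[Span]:
    """Return the (start, end) ranges left after removing `ad_segments`.

    Overlapping or out-of-range ad segments are tolerated.
    """
    keep: List[Span] = []
    cursor = 0.0
    for start, end in sorted(ad_segments):
        # Clamp each segment to the episode bounds.
        start = min(max(start, 0.0), duration)
        end = min(max(end, 0.0), duration)
        if start > cursor:
            keep.append((cursor, start))
        cursor = max(cursor, end)
    if cursor < duration:
        keep.append((cursor, duration))
    return keep


ranges = keep_ranges(100.0, [(50.0, 60.0), (10.0, 20.0)])
```

Each kept range could then be handed to an audio tool such as ffmpeg to cut and concatenate; the exact invocation is left out here.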
You could shove this function into a podcast player, but then you’d need to replicate all the existing permutations of player applications.
The beauty of the current podcast environment is it’s just RSS feeds that point to audio files in a standard way. This permits handling by a shim proxy in the middle of the transaction between the publisher and the player.
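As a sketch of how small that shim could be: rewrite each `<enclosure>` URL in the feed so the player fetches audio through your own instance. The proxy address and the `src` query parameter are made up for illustration, and this ignores details a real feed would need (for example, registering the `itunes:` namespace so prefixes survive a round trip).

```python
import xml.etree.ElementTree as ET
from urllib.parse import quote


def rewrite_feed(feed_xml: str, proxy_base: str) -> str:
    """Point every enclosure URL in an RSS feed at `proxy_base`.

    proxy_base is a hypothetical endpoint on your own instance that would
    download, clean, and serve the original audio.
    """
    root = ET.fromstring(feed_xml)
    for enclosure in root.iter("enclosure"):
        original = enclosure.get("url")
        if original:
            enclosure.set("url", f"{proxy_base}?src={quote(original, safe='')}")
    return ET.tostring(root, encoding="unicode")


feed = (
    '<rss version="2.0"><channel><title>Example Show</title>'
    '<item><title>Episode 1</title>'
    '<enclosure url="https://example.com/ep1.mp3" type="audio/mpeg" length="123"/>'
    '</item></channel></rss>'
)
rewritten = rewrite_feed(feed, "http://localhost:8080/clean")
```

The player never needs to know anything changed; it just subscribes to the rewritten feed.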
This could also be a way to better incorporate media into the fediverse. One example is the chapters and transcripts generated could be directly referenced in Lemmy and Mastodon posts.
But how do you learn about Raycons? Seriously, there are some podcasters that I really like. The podcast and the people behind it. But it's getting really hard to not dislike someone that's willingly trying to sell you garbage. Like your best friend from high school that is trying to pull you into his pyramid scheme.
I used to listen to the Marc Maron podcast, because he's pretty good at talking to people and he had some cool guests. His sponsors were that stamp company and some website company. Now Marc's whole thing is that he's a boomer that doesn't like new things. One day he did his show and then it was time for a commercial and he was like: and today's sponsor is ROCKET LEAGUE. ROCKET LEAGUE is a fun and exciting.... What in the fuck. This isn't just a scam, you have not the slightest idea what you are advertising for.
During the early couple years of MBMBAM, I didn’t mind their ad reads. They made it funny. I never skipped Cumtown’s ad reads because they read them so shitty it was extremely funny.
Every other podcast I detest the reads.
Fuck you Shopify, your loud grating DING makes me flinch every time it blasts in my ears. I hate you.
I saw sponsorblock-ml and am playing with it, using whisperx for transcripts/timestamps and ffmpeg for cutting out the segments that were detected by sponsorblock-ml, then re-serving that audio as an RSS feed.
You can set Pocket Casts to skip the first or last X seconds of an episode, which I’ve found helps. I also set fast forward to go 30s and rewind to 15 so it’s easier to scrub forward through an ad and I’m never too far off when I go over and have to rewind.
I like AntennaPod for this - my BT connections let me use the Forward 30 Seconds feature when I'm driving or running. Since most ads are 30 seconds long, I can cruise through them easily.
People who are looking for direct integration between podcast players and SponsorBlock seem to be missing that a lot of podcasts these days that do have advertising in them often have dynamic ads, where the ad audio will change depending on the day, the geographical location of the download, etc. So SponsorBlock can't actually account for what are essentially dynamic timestamps, whereas with YouTube you typically have fairly static timestamps that can be shared across a user base. Only smaller podcasts are really going to be able to be captured by SponsorBlock, unless someone discovers a way to mod an Android APK to prevent the client-side compilation of ads and the original podcast audio, assuming that there is a podcast app that does this on the client side.
Does this happen on iOS too? I listen to podcasts every night for an hour and never hear any ads except for the ones encoded with the file, i.e. sponsor ads.