mirror of https://github.com/yt-dlp/yt-dlp.git synced 2025-07-09 06:48:30 +00:00

Merge branch 'yt-dlp:master' into roku-new-extractor

Sipherdrakon 2025-05-29 00:10:00 -04:00 committed by GitHub
commit 62d364b93d
No known key found for this signature in database
GPG Key ID: B5690EEEBB952194
174 changed files with 11065 additions and 4154 deletions


@@ -192,7 +192,7 @@ jobs:
 with:
 path: ./repo
 - name: Virtualized Install, Prepare & Build
-uses: yt-dlp/run-on-arch-action@v2
+uses: yt-dlp/run-on-arch-action@v3
 with:
 # Ref: https://github.com/uraimo/run-on-arch-action/issues/55
 env: |
@@ -256,7 +256,7 @@ jobs:
 with:
 path: |
 ~/yt-dlp-build-venv
-key: cache-reqs-${{ github.job }}
+key: cache-reqs-${{ github.job }}-${{ github.ref }}
 - name: Install Requirements
 run: |
@@ -331,19 +331,16 @@ jobs:
 if: steps.restore-cache.outputs.cache-hit == 'true'
 env:
 GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
-cache_key: cache-reqs-${{ github.job }}
-repository: ${{ github.repository }}
-branch: ${{ github.ref }}
+cache_key: cache-reqs-${{ github.job }}-${{ github.ref }}
 run: |
-gh extension install actions/gh-actions-cache
-gh actions-cache delete "${cache_key}" -R "${repository}" -B "${branch}" --confirm
+gh cache delete "${cache_key}"
 - name: Cache requirements
 uses: actions/cache/save@v4
 with:
 path: |
 ~/yt-dlp-build-venv
-key: cache-reqs-${{ github.job }}
+key: cache-reqs-${{ github.job }}-${{ github.ref }}
 macos_legacy:
 needs: process
@@ -411,7 +408,7 @@ jobs:
 run: | # Custom pyinstaller built with https://github.com/yt-dlp/pyinstaller-builds
 python devscripts/install_deps.py -o --include build
 python devscripts/install_deps.py --include curl-cffi
-python -m pip install -U "https://yt-dlp.github.io/Pyinstaller-Builds/x86_64/pyinstaller-6.11.1-py3-none-any.whl"
+python -m pip install -U "https://yt-dlp.github.io/Pyinstaller-Builds/x86_64/pyinstaller-6.13.0-py3-none-any.whl"
 - name: Prepare
 run: |
@@ -460,7 +457,7 @@ jobs:
 run: |
 python devscripts/install_deps.py -o --include build
 python devscripts/install_deps.py
-python -m pip install -U "https://yt-dlp.github.io/Pyinstaller-Builds/i686/pyinstaller-6.11.1-py3-none-any.whl"
+python -m pip install -U "https://yt-dlp.github.io/Pyinstaller-Builds/i686/pyinstaller-6.13.0-py3-none-any.whl"
 - name: Prepare
 run: |


@@ -6,7 +6,7 @@ on:
 - devscripts/**
 - test/**
 - yt_dlp/**.py
-- '!yt_dlp/extractor/*.py'
+- '!yt_dlp/extractor/**.py'
 - yt_dlp/extractor/__init__.py
 - yt_dlp/extractor/common.py
 - yt_dlp/extractor/extractors.py
@@ -16,7 +16,7 @@ on:
 - devscripts/**
 - test/**
 - yt_dlp/**.py
-- '!yt_dlp/extractor/*.py'
+- '!yt_dlp/extractor/**.py'
 - yt_dlp/extractor/__init__.py
 - yt_dlp/extractor/common.py
 - yt_dlp/extractor/extractors.py
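
The effect of switching the negated filter from `*.py` to `**.py` can be illustrated with a minimal Python sketch. It assumes the documented GitHub Actions filter semantics (`*` does not match `/`, `**` does, and the last matching pattern decides whether a path is included) and deliberately handles only those two wildcards. The helper names and the nested path `yt_dlp/extractor/youtube/_video.py` are hypothetical and used purely for illustration.

```python
import re

def filter_to_regex(pattern):
    """Translate a simplified path filter into a compiled regex.

    Only `*` (any characters except '/') and `**` (any characters,
    including '/') are handled; other special syntax is out of scope.
    """
    parts = []
    i = 0
    while i < len(pattern):
        if pattern.startswith("**", i):
            parts.append(".*")
            i += 2
        elif pattern[i] == "*":
            parts.append("[^/]*")
            i += 1
        else:
            parts.append(re.escape(pattern[i]))
            i += 1
    return re.compile("".join(parts))

def is_included(filters, path):
    """Apply filters in order; the last matching pattern wins."""
    included = False
    for entry in filters:
        negated = entry.startswith("!")
        pattern = entry[1:] if negated else entry
        if filter_to_regex(pattern).fullmatch(path):
            included = not negated
    return included

# Filter lists as they appear before and after the hunk.
TAIL = [
    "yt_dlp/extractor/__init__.py",
    "yt_dlp/extractor/common.py",
    "yt_dlp/extractor/extractors.py",
]
HEAD = ["devscripts/**", "test/**", "yt_dlp/**.py"]
OLD_FILTERS = HEAD + ["!yt_dlp/extractor/*.py"] + TAIL
NEW_FILTERS = HEAD + ["!yt_dlp/extractor/**.py"] + TAIL

# Hypothetical file in a nested extractor package.
nested = "yt_dlp/extractor/youtube/_video.py"
print(is_included(OLD_FILTERS, nested))  # old filter does not exclude it
print(is_included(NEW_FILTERS, nested))  # new filter excludes it
```

Under these assumptions, the old pattern only excluded files directly inside `yt_dlp/extractor/`, while the new one also excludes files in subdirectories; the three explicitly listed files are re-included in both cases.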


@@ -38,3 +38,5 @@ jobs:
 run: ruff check --output-format github .
 - name: Run autopep8
 run: autopep8 --diff .
+- name: Check file mode
+run: git ls-files --format="%(objectmode) %(path)" yt_dlp/ | ( ! grep -v "^100644" )
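
The new step passes only when `grep -v "^100644"` selects no lines, that is, when every tracked file under `yt_dlp/` has mode `100644` (regular, non-executable). The same check can be sketched in Python as follows. This is an illustrative equivalent rather than code from the repository; the function name and the sample paths are hypothetical, and the input is assumed to be the text produced by the `git ls-files --format="%(objectmode) %(path)"` call above.

```python
def find_unexpected_modes(ls_files_output, expected="100644"):
    """Return (mode, path) pairs whose mode differs from `expected`.

    `ls_files_output` is assumed to contain one "<mode> <path>" entry
    per line, as produced by the --format string used in the workflow.
    """
    offenders = []
    for line in ls_files_output.splitlines():
        if not line.strip():
            continue  # ignore blank lines
        mode, _, path = line.partition(" ")
        if mode != expected:
            offenders.append((mode, path))
    return offenders

# Hypothetical sample output: the second entry is marked executable.
sample = "100644 yt_dlp/__init__.py\n100755 yt_dlp/example_tool.py\n"
for mode, path in find_unexpected_modes(sample):
    print(f"unexpected mode {mode}: {path}")
```

An empty result corresponds to the shell pipeline exiting with status 0; any offender corresponds to a failing CI step.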


@@ -742,3 +742,36 @@ lfavole
 mp3butcher
 slipinthedove
 YoshiTabletopGamer
+Arc8ne
+benfaerber
+chrisellsworth
+fries1234
+Kenshin9977
+MichaelDeBoey
+msikma
+pedro
+pferreir
+red-acid
+refack
+rysson
+somini
+thedenv
+vallovic
+arabcoders
+mireq
+mlabeeb03
+1271
+CasperMcFadden95
+Kicer86
+Kiritomo
+leeblackc
+meGAmeS1
+NeonMan
+pj47x
+troex
+WouterGordts
+baierjan
+GeoffreyFrogeye
+Pawka
+v3DJG6GL
+yozel


@@ -4,6 +4,267 @@ # Changelog
# To create a release, dispatch the https://github.com/yt-dlp/yt-dlp/actions/workflows/release.yml workflow on master
-->
### 2025.05.22
#### Core changes
- **cookies**: [Fix Linux desktop environment detection](https://github.com/yt-dlp/yt-dlp/commit/e491fd4d090db3af52a82863fb0553dd5e17fb85) ([#13197](https://github.com/yt-dlp/yt-dlp/issues/13197)) by [mbway](https://github.com/mbway)
- **jsinterp**: [Fix increment/decrement evaluation](https://github.com/yt-dlp/yt-dlp/commit/167d7a9f0ffd1b4fe600193441bdb7358db2740b) ([#13238](https://github.com/yt-dlp/yt-dlp/issues/13238)) by [bashonly](https://github.com/bashonly), [seproDev](https://github.com/seproDev)
#### Extractor changes
- **1tv**: [Fix extractor](https://github.com/yt-dlp/yt-dlp/commit/41c0a1fb89628696f8bb88e2b9f3a68f355b8c26) ([#13168](https://github.com/yt-dlp/yt-dlp/issues/13168)) by [bashonly](https://github.com/bashonly)
- **amcnetworks**: [Fix extractor](https://github.com/yt-dlp/yt-dlp/commit/464c84fedf78eef822a431361155f108b5df96d7) ([#13147](https://github.com/yt-dlp/yt-dlp/issues/13147)) by [bashonly](https://github.com/bashonly)
- **bitchute**: [Fix extractor](https://github.com/yt-dlp/yt-dlp/commit/1d0f6539c47e5d5c68c3c47cdb7075339e2885ac) ([#13081](https://github.com/yt-dlp/yt-dlp/issues/13081)) by [bashonly](https://github.com/bashonly)
- **cartoonnetwork**: [Remove extractor](https://github.com/yt-dlp/yt-dlp/commit/7dbb47f84f0ee1266a3a01f58c9bc4c76d76794a) ([#13148](https://github.com/yt-dlp/yt-dlp/issues/13148)) by [bashonly](https://github.com/bashonly)
- **iprima**: [Fix login support](https://github.com/yt-dlp/yt-dlp/commit/a7d9a5eb79ceeecb851389f3f2c88597871ca3f2) ([#12937](https://github.com/yt-dlp/yt-dlp/issues/12937)) by [baierjan](https://github.com/baierjan)
- **jiosaavn**
    - artist: [Add extractor](https://github.com/yt-dlp/yt-dlp/commit/586b557b124f954d3f625360ebe970989022ad97) ([#12803](https://github.com/yt-dlp/yt-dlp/issues/12803)) by [subrat-lima](https://github.com/subrat-lima)
    - playlist, show: [Add extractor](https://github.com/yt-dlp/yt-dlp/commit/317f4b8006c2c0f0f64f095b1485163ad97c9053) ([#12803](https://github.com/yt-dlp/yt-dlp/issues/12803)) by [subrat-lima](https://github.com/subrat-lima)
    - show: [Add extractor](https://github.com/yt-dlp/yt-dlp/commit/6839276496d8814cf16f58b637e45663467928e6) ([#12803](https://github.com/yt-dlp/yt-dlp/issues/12803)) by [subrat-lima](https://github.com/subrat-lima)
- **lrtradio**: [Fix extractor](https://github.com/yt-dlp/yt-dlp/commit/abf58dcd6a09e14eec4ea82ae12f79a0337cb383) ([#13200](https://github.com/yt-dlp/yt-dlp/issues/13200)) by [Pawka](https://github.com/Pawka)
- **nebula**: [Support `--mark-watched`](https://github.com/yt-dlp/yt-dlp/commit/20f288bdc2173c7cc58d709d25ca193c1f6001e7) ([#13120](https://github.com/yt-dlp/yt-dlp/issues/13120)) by [GeoffreyFrogeye](https://github.com/GeoffreyFrogeye)
- **niconico**
    - [Fix error handling](https://github.com/yt-dlp/yt-dlp/commit/f569be4602c2a857087e495d5d7ed6060cd97abe) ([#13236](https://github.com/yt-dlp/yt-dlp/issues/13236)) by [bashonly](https://github.com/bashonly)
    - live: [Fix extractor](https://github.com/yt-dlp/yt-dlp/commit/7a7b85c9014d96421e18aa7ea5f4c1bee5ceece0) ([#13045](https://github.com/yt-dlp/yt-dlp/issues/13045)) by [doe1080](https://github.com/doe1080)
- **nytimesarticle**: [Fix extraction](https://github.com/yt-dlp/yt-dlp/commit/b26bc32579c00ef579d75a835807ccc87d20ee0a) ([#13104](https://github.com/yt-dlp/yt-dlp/issues/13104)) by [bashonly](https://github.com/bashonly)
- **once**: [Remove extractor](https://github.com/yt-dlp/yt-dlp/commit/f475e8b529d18efdad603ffda02a56e707fe0e2c) ([#13164](https://github.com/yt-dlp/yt-dlp/issues/13164)) by [bashonly](https://github.com/bashonly)
- **picarto**: vod: [Support `/profile/` video URLs](https://github.com/yt-dlp/yt-dlp/commit/31e090cb787f3504ec25485adff9a2a51d056734) ([#13227](https://github.com/yt-dlp/yt-dlp/issues/13227)) by [subrat-lima](https://github.com/subrat-lima)
- **playsuisse**: [Improve metadata extraction](https://github.com/yt-dlp/yt-dlp/commit/d880e060803ae8ed5a047e578cca01e1f0e630ce) ([#12466](https://github.com/yt-dlp/yt-dlp/issues/12466)) by [v3DJG6GL](https://github.com/v3DJG6GL)
- **sprout**: [Remove extractor](https://github.com/yt-dlp/yt-dlp/commit/cbcfe6378dde33a650e3852ab17ad4503b8e008d) ([#13149](https://github.com/yt-dlp/yt-dlp/issues/13149)) by [bashonly](https://github.com/bashonly)
- **svtpage**: [Fix extractor](https://github.com/yt-dlp/yt-dlp/commit/ea8498ed534642dd7e925961b97b934987142fd3) ([#12957](https://github.com/yt-dlp/yt-dlp/issues/12957)) by [diman8](https://github.com/diman8)
- **twitch**: [Support `--live-from-start`](https://github.com/yt-dlp/yt-dlp/commit/00b1bec55249cf2ad6271d36492c51b34b6459d1) ([#13202](https://github.com/yt-dlp/yt-dlp/issues/13202)) by [bashonly](https://github.com/bashonly)
- **vimeo**: event: [Add extractor](https://github.com/yt-dlp/yt-dlp/commit/545c1a5b6f2fe88722b41aef0e7485bf3be3f3f9) ([#13216](https://github.com/yt-dlp/yt-dlp/issues/13216)) by [bashonly](https://github.com/bashonly)
- **wat.tv**: [Improve error handling](https://github.com/yt-dlp/yt-dlp/commit/f123cc83b3aea45053f5fa1d9141048b01fc2774) ([#13111](https://github.com/yt-dlp/yt-dlp/issues/13111)) by [bashonly](https://github.com/bashonly)
- **weverse**: [Fix live extraction](https://github.com/yt-dlp/yt-dlp/commit/5328eda8820cc5f21dcf917684d23fbdca41831d) ([#13084](https://github.com/yt-dlp/yt-dlp/issues/13084)) by [bashonly](https://github.com/bashonly)
- **xinpianchang**: [Fix extractor](https://github.com/yt-dlp/yt-dlp/commit/83fabf352489d52843f67e6e9cc752db86d27e6e) ([#13245](https://github.com/yt-dlp/yt-dlp/issues/13245)) by [garret1317](https://github.com/garret1317)
- **youtube**
    - [Add PO token support for subtitles](https://github.com/yt-dlp/yt-dlp/commit/32ed5f107c6c641958d1cd2752e130de4db55a13) ([#13234](https://github.com/yt-dlp/yt-dlp/issues/13234)) by [bashonly](https://github.com/bashonly), [coletdjnz](https://github.com/coletdjnz)
    - [Add `web_embedded` client for age-restricted videos](https://github.com/yt-dlp/yt-dlp/commit/0feec6dc131f488428bf881519e7c69766fbb9ae) ([#13089](https://github.com/yt-dlp/yt-dlp/issues/13089)) by [bashonly](https://github.com/bashonly)
    - [Add a PO Token Provider Framework](https://github.com/yt-dlp/yt-dlp/commit/2685654a37141cca63eda3a92da0e2706e23ccfd) ([#12840](https://github.com/yt-dlp/yt-dlp/issues/12840)) by [coletdjnz](https://github.com/coletdjnz)
    - [Extract `media_type` for all videos](https://github.com/yt-dlp/yt-dlp/commit/ded11ebc9afba6ba33923375103e9be2d7c804e7) ([#13136](https://github.com/yt-dlp/yt-dlp/issues/13136)) by [bashonly](https://github.com/bashonly)
    - [Fix `--live-from-start` support for premieres](https://github.com/yt-dlp/yt-dlp/commit/8f303afb43395be360cafd7ad4ce2b6e2eedfb8a) ([#13079](https://github.com/yt-dlp/yt-dlp/issues/13079)) by [arabcoders](https://github.com/arabcoders)
    - [Fix geo-restriction error handling](https://github.com/yt-dlp/yt-dlp/commit/c7e575e31608c19c5b26c10a4229db89db5fc9a8) ([#13217](https://github.com/yt-dlp/yt-dlp/issues/13217)) by [yozel](https://github.com/yozel)
#### Misc. changes
- **build**
    - [Bump PyInstaller to v6.13.0](https://github.com/yt-dlp/yt-dlp/commit/17cf9088d0d535e4a7feffbf02bd49cd9dae5ab9) ([#13082](https://github.com/yt-dlp/yt-dlp/issues/13082)) by [bashonly](https://github.com/bashonly)
    - [Bump run-on-arch-action to v3](https://github.com/yt-dlp/yt-dlp/commit/9064d2482d1fe722bbb4a49731fe0711c410d1c8) ([#13088](https://github.com/yt-dlp/yt-dlp/issues/13088)) by [bashonly](https://github.com/bashonly)
- **cleanup**: Miscellaneous: [7977b32](https://github.com/yt-dlp/yt-dlp/commit/7977b329ed97b216e37bd402f4935f28c00eac9e) by [bashonly](https://github.com/bashonly)
### 2025.04.30
#### Important changes
- **New option `--preset-alias`/`-t` has been added**
This provides convenient predefined aliases for common use cases. Available presets include `mp4`, `mp3`, `mkv`, `aac`, and `sleep`. See [the README](https://github.com/yt-dlp/yt-dlp/blob/master/README.md#preset-aliases) for more details.
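
As a usage illustration, the following Python sketch builds a `yt-dlp` command line that selects one of the presets named above. Only the `-t` flag and the preset names come from the changelog entry; the helper `build_preset_command` and the example URL are hypothetical, and the sketch constructs the argument list without running anything.

```python
# Preset names as listed in the changelog entry above.
PRESET_ALIASES = ("mp4", "mp3", "mkv", "aac", "sleep")

def build_preset_command(url, preset):
    """Return an argv list invoking yt-dlp with `-t <preset>`.

    Hypothetical helper for illustration; it validates the preset
    against the names from the changelog and does not execute yt-dlp.
    """
    if preset not in PRESET_ALIASES:
        raise ValueError(f"unknown preset alias: {preset!r}")
    return ["yt-dlp", "-t", preset, url]

# Placeholder URL, not a real video.
print(" ".join(build_preset_command("https://example.com/video", "mp3")))
```

The resulting list can be passed to `subprocess.run` on a system where `yt-dlp` is installed; what each preset expands to is described in the README section linked above.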
#### Core changes
- [Add `--preset-alias` option](https://github.com/yt-dlp/yt-dlp/commit/88eb1e7a9a2720ac89d653c0d0e40292388823bb) ([#12839](https://github.com/yt-dlp/yt-dlp/issues/12839)) by [Grub4K](https://github.com/Grub4K), [seproDev](https://github.com/seproDev)
- **utils**
    - `_yield_json_ld`: [Make function less fatal](https://github.com/yt-dlp/yt-dlp/commit/45f01de00e1bc076b7f676a669736326178647b1) ([#12855](https://github.com/yt-dlp/yt-dlp/issues/12855)) by [seproDev](https://github.com/seproDev)
    - `url_or_none`: [Support WebSocket URLs](https://github.com/yt-dlp/yt-dlp/commit/a473e592337edb8ca40cde52c1fcaee261c54df9) ([#12848](https://github.com/yt-dlp/yt-dlp/issues/12848)) by [doe1080](https://github.com/doe1080)
#### Extractor changes
- **abematv**: [Fix thumbnail extraction](https://github.com/yt-dlp/yt-dlp/commit/f5736bb35bde62348caebf7b188668655e316deb) ([#12859](https://github.com/yt-dlp/yt-dlp/issues/12859)) by [Kiritomo](https://github.com/Kiritomo)
- **atresplayer**: [Rework extractor](https://github.com/yt-dlp/yt-dlp/commit/839d64325356310e6de6cd9cad28fb546619ca63) ([#11424](https://github.com/yt-dlp/yt-dlp/issues/11424)) by [meGAmeS1](https://github.com/meGAmeS1), [seproDev](https://github.com/seproDev)
- **bpb**: [Fix formats extraction](https://github.com/yt-dlp/yt-dlp/commit/80736b9c90818adee933a155079b8535bc06819f) ([#13015](https://github.com/yt-dlp/yt-dlp/issues/13015)) by [bashonly](https://github.com/bashonly)
- **cda**: [Fix formats extraction](https://github.com/yt-dlp/yt-dlp/commit/9032f981362ea0be90626fab51ec37934feded6d) ([#12975](https://github.com/yt-dlp/yt-dlp/issues/12975)) by [bashonly](https://github.com/bashonly)
- **cdafolder**: [Extend `_VALID_URL`](https://github.com/yt-dlp/yt-dlp/commit/cb271d445bc2d866c9a3404b1d8f59bcb77447df) ([#12919](https://github.com/yt-dlp/yt-dlp/issues/12919)) by [fireattack](https://github.com/fireattack), [Kicer86](https://github.com/Kicer86)
- **crowdbunker**: [Make format extraction non-fatal](https://github.com/yt-dlp/yt-dlp/commit/4ebf41309d04a6e196944f1c0f5f0154cff0055a) ([#12836](https://github.com/yt-dlp/yt-dlp/issues/12836)) by [seproDev](https://github.com/seproDev)
- **dacast**: [Support tokenized URLs](https://github.com/yt-dlp/yt-dlp/commit/e7e3b7a55c456da4a5a812b4fefce4dce8e6a616) ([#12979](https://github.com/yt-dlp/yt-dlp/issues/12979)) by [bashonly](https://github.com/bashonly)
- **dzen.ru**: [Rework extractors](https://github.com/yt-dlp/yt-dlp/commit/a3f2b54c2535d862de6efa9cfaa6ca9a2b2f7dd6) ([#12852](https://github.com/yt-dlp/yt-dlp/issues/12852)) by [seproDev](https://github.com/seproDev)
- **generic**: [Fix MPD extraction for `file://` URLs](https://github.com/yt-dlp/yt-dlp/commit/34a061a295d156934417c67ee98070b94943006b) ([#12978](https://github.com/yt-dlp/yt-dlp/issues/12978)) by [bashonly](https://github.com/bashonly)
- **getcourseru**: [Fix extractors](https://github.com/yt-dlp/yt-dlp/commit/741fd809bc4d301c19b53877692ae510334a6750) ([#12943](https://github.com/yt-dlp/yt-dlp/issues/12943)) by [troex](https://github.com/troex)
- **ivoox**: [Add extractor](https://github.com/yt-dlp/yt-dlp/commit/7faa18b83dcfc74a1a1e2034e6b0369c495ca645) ([#12768](https://github.com/yt-dlp/yt-dlp/issues/12768)) by [NeonMan](https://github.com/NeonMan), [seproDev](https://github.com/seproDev)
- **kika**: [Add playlist extractor](https://github.com/yt-dlp/yt-dlp/commit/3c1c75ecb8ab352f422b59af46fff2be992e4115) ([#12832](https://github.com/yt-dlp/yt-dlp/issues/12832)) by [1100101](https://github.com/1100101)
- **linkedin**
    - [Support feed URLs](https://github.com/yt-dlp/yt-dlp/commit/73a26f9ee68610e33c0b4407b77355f2ab7afd0e) ([#12927](https://github.com/yt-dlp/yt-dlp/issues/12927)) by [seproDev](https://github.com/seproDev)
    - events: [Add extractor](https://github.com/yt-dlp/yt-dlp/commit/b37ff4de5baf4e4e70c6a0ec34e136a279ad20af) ([#12926](https://github.com/yt-dlp/yt-dlp/issues/12926)) by [bashonly](https://github.com/bashonly), [seproDev](https://github.com/seproDev)
- **loco**: [Fix extractor](https://github.com/yt-dlp/yt-dlp/commit/f5a37ea40e20865b976ffeeff13eeae60292eb23) ([#12934](https://github.com/yt-dlp/yt-dlp/issues/12934)) by [seproDev](https://github.com/seproDev)
- **lrtradio**: [Add extractor](https://github.com/yt-dlp/yt-dlp/commit/74e90dd9b8f9c1a5c48a2515126654f4d398d687) ([#12801](https://github.com/yt-dlp/yt-dlp/issues/12801)) by [subrat-lima](https://github.com/subrat-lima)
- **manyvids**: [Fix extractor](https://github.com/yt-dlp/yt-dlp/commit/77aa15e98f34c4ad425aabf39dd1ee37b48f772c) ([#10907](https://github.com/yt-dlp/yt-dlp/issues/10907)) by [pj47x](https://github.com/pj47x)
- **mixcloud**: [Refactor extractor](https://github.com/yt-dlp/yt-dlp/commit/db6d1f145ad583e0220637726029f8f2fa6200a0) ([#12830](https://github.com/yt-dlp/yt-dlp/issues/12830)) by [seproDev](https://github.com/seproDev), [WouterGordts](https://github.com/WouterGordts)
- **mlbtv**: [Fix device ID caching](https://github.com/yt-dlp/yt-dlp/commit/36da6360e130197df927ee93409519ce3f4075f5) ([#12980](https://github.com/yt-dlp/yt-dlp/issues/12980)) by [bashonly](https://github.com/bashonly)
- **niconico**
    - [Fix login support](https://github.com/yt-dlp/yt-dlp/commit/25cd7c1ecbb6cbf21dd3a6e59608e4af94715ecc) ([#13008](https://github.com/yt-dlp/yt-dlp/issues/13008)) by [doe1080](https://github.com/doe1080)
    - [Remove DMC formats support](https://github.com/yt-dlp/yt-dlp/commit/7d05aa99c65352feae1cd9a3ff8784b64bfe382a) ([#12916](https://github.com/yt-dlp/yt-dlp/issues/12916)) by [doe1080](https://github.com/doe1080)
    - live: [Fix extractor](https://github.com/yt-dlp/yt-dlp/commit/1d45e30537bf83e069184a440703e4c43b2e0198) ([#12809](https://github.com/yt-dlp/yt-dlp/issues/12809)) by [Snack-X](https://github.com/Snack-X)
- **panopto**: [Fix formats extraction](https://github.com/yt-dlp/yt-dlp/commit/9d26daa04ad5108257bc5e30f7f040c7f1fe7a5a) ([#12925](https://github.com/yt-dlp/yt-dlp/issues/12925)) by [seproDev](https://github.com/seproDev)
- **parti**: [Add extractors](https://github.com/yt-dlp/yt-dlp/commit/425017531fbc3369becb5a44013e26f26efabf45) ([#12769](https://github.com/yt-dlp/yt-dlp/issues/12769)) by [benfaerber](https://github.com/benfaerber)
- **raiplay**: [Fix DRM detection](https://github.com/yt-dlp/yt-dlp/commit/dce82346245e35a46fda836ca2089805d2347935) ([#12971](https://github.com/yt-dlp/yt-dlp/issues/12971)) by [DTrombett](https://github.com/DTrombett)
- **reddit**: [Support `--ignore-no-formats-error`](https://github.com/yt-dlp/yt-dlp/commit/28f04e8a5e383ff531db646190b4be45554610d6) ([#12993](https://github.com/yt-dlp/yt-dlp/issues/12993)) by [bashonly](https://github.com/bashonly)
- **royalive**: [Add extractor](https://github.com/yt-dlp/yt-dlp/commit/e1847535e28788414a25546a45bebcada2f34558) ([#12817](https://github.com/yt-dlp/yt-dlp/issues/12817)) by [CasperMcFadden95](https://github.com/CasperMcFadden95)
- **rtve**: [Rework extractors](https://github.com/yt-dlp/yt-dlp/commit/f07ee91c71920ab1187a7ea756720e81aa406a9d) ([#10388](https://github.com/yt-dlp/yt-dlp/issues/10388)) by [meGAmeS1](https://github.com/meGAmeS1), [seproDev](https://github.com/seproDev)
- **rumble**: [Improve format extraction](https://github.com/yt-dlp/yt-dlp/commit/58d0c83457b93b3c9a81eb6bc5a4c65f25e949df) ([#12838](https://github.com/yt-dlp/yt-dlp/issues/12838)) by [seproDev](https://github.com/seproDev)
- **tokfmpodcast**: [Fix formats extraction](https://github.com/yt-dlp/yt-dlp/commit/91832111a12d87499294a0f430829b8c2254c339) ([#12842](https://github.com/yt-dlp/yt-dlp/issues/12842)) by [selfisekai](https://github.com/selfisekai)
- **tv2dk**: [Fix extractor](https://github.com/yt-dlp/yt-dlp/commit/a3e91df30a45943f40759d2c1e0b6c2ca4b2a263) ([#12945](https://github.com/yt-dlp/yt-dlp/issues/12945)) by [bashonly](https://github.com/bashonly), [seproDev](https://github.com/seproDev)
- **tvp**: vod: [Improve `_VALID_URL`](https://github.com/yt-dlp/yt-dlp/commit/4e69a626cce51428bc1d66dc606a56d9498b03a5) ([#12923](https://github.com/yt-dlp/yt-dlp/issues/12923)) by [seproDev](https://github.com/seproDev)
- **tvw**: tvchannels: [Add extractor](https://github.com/yt-dlp/yt-dlp/commit/ed8ad1b4d6b9d7a1426ff5192ff924f3371e4721) ([#12721](https://github.com/yt-dlp/yt-dlp/issues/12721)) by [fries1234](https://github.com/fries1234)
- **twitcasting**: [Fix livestream extraction](https://github.com/yt-dlp/yt-dlp/commit/de271a06fd6d20d4f55597ff7f90e4d913de0a52) ([#12977](https://github.com/yt-dlp/yt-dlp/issues/12977)) by [bashonly](https://github.com/bashonly)
- **twitch**: clips: [Fix uploader metadata extraction](https://github.com/yt-dlp/yt-dlp/commit/1ae6bff564a65af41e94f1a4727892471ecdd05a) ([#13022](https://github.com/yt-dlp/yt-dlp/issues/13022)) by [1271](https://github.com/1271)
- **twitter**
    - [Fix extraction when logged-in](https://github.com/yt-dlp/yt-dlp/commit/1cf39ddf3d10b6512daa7dd139e5f6c0dc548bbc) ([#13024](https://github.com/yt-dlp/yt-dlp/issues/13024)) by [bashonly](https://github.com/bashonly)
    - spaces: [Improve metadata extraction](https://github.com/yt-dlp/yt-dlp/commit/70599e53b736bb75922b737e6e0d4f76e419bb20) ([#12911](https://github.com/yt-dlp/yt-dlp/issues/12911)) by [doe1080](https://github.com/doe1080)
- **vimeo**: [Extract from mobile API](https://github.com/yt-dlp/yt-dlp/commit/22ac81a0692019ac833cf282e4ef99718e9ef3fa) ([#13034](https://github.com/yt-dlp/yt-dlp/issues/13034)) by [bashonly](https://github.com/bashonly)
- **vk**
    - [Fix chapters extraction](https://github.com/yt-dlp/yt-dlp/commit/5361a7c6e2933c919716e0cb1e3116c28c40419f) ([#12821](https://github.com/yt-dlp/yt-dlp/issues/12821)) by [seproDev](https://github.com/seproDev)
    - [Fix uploader extraction](https://github.com/yt-dlp/yt-dlp/commit/2381881fe58a723853350a6ab750a5efc9f10c85) ([#12985](https://github.com/yt-dlp/yt-dlp/issues/12985)) by [seproDev](https://github.com/seproDev)
- **youtube**
    - [Add context to video request rate limit error](https://github.com/yt-dlp/yt-dlp/commit/26feac3dd142536ad08ad1ed731378cb88e63602) ([#12958](https://github.com/yt-dlp/yt-dlp/issues/12958)) by [coletdjnz](https://github.com/coletdjnz)
    - [Add extractor arg to skip "initial_data" request](https://github.com/yt-dlp/yt-dlp/commit/ed6c6d7eefbc78fa72e4e60ad6edaa3ee2acc715) ([#12865](https://github.com/yt-dlp/yt-dlp/issues/12865)) by [leeblackc](https://github.com/leeblackc)
    - [Add warning on video captcha challenge](https://github.com/yt-dlp/yt-dlp/commit/f484c51599a6cd01eb078ea7dc9bbba942967774) ([#12939](https://github.com/yt-dlp/yt-dlp/issues/12939)) by [coletdjnz](https://github.com/coletdjnz)
    - [Cache signature timestamps](https://github.com/yt-dlp/yt-dlp/commit/61c9a938b390b8334ee3a879fe2d93f714e30138) ([#13047](https://github.com/yt-dlp/yt-dlp/issues/13047)) by [bashonly](https://github.com/bashonly)
    - [Detect and warn when account cookies are rotated](https://github.com/yt-dlp/yt-dlp/commit/8cb08028f5be2acb9835ce1670b196b9b077052f) ([#13014](https://github.com/yt-dlp/yt-dlp/issues/13014)) by [coletdjnz](https://github.com/coletdjnz)
    - [Detect player JS variants for any locale](https://github.com/yt-dlp/yt-dlp/commit/c2d6659d1069f8cff97e1fd61d1c59e949e1e63d) ([#13003](https://github.com/yt-dlp/yt-dlp/issues/13003)) by [bashonly](https://github.com/bashonly)
    - [Do not strictly deprioritize `missing_pot` formats](https://github.com/yt-dlp/yt-dlp/commit/74fc2ae12c24eb6b4e02c6360c89bd05f3c8f740) ([#13061](https://github.com/yt-dlp/yt-dlp/issues/13061)) by [bashonly](https://github.com/bashonly)
    - [Improve warning for SABR-only/SSAP player responses](https://github.com/yt-dlp/yt-dlp/commit/fd8394bc50301ac5e930aa65aa71ab1b8372b8ab) ([#13049](https://github.com/yt-dlp/yt-dlp/issues/13049)) by [bashonly](https://github.com/bashonly)
    - tab: [Extract continuation from empty page](https://github.com/yt-dlp/yt-dlp/commit/72ba4879304c2082fecbb472e6cc05ee2d154a3b) ([#12938](https://github.com/yt-dlp/yt-dlp/issues/12938)) by [coletdjnz](https://github.com/coletdjnz)
- **zdf**: [Fix extractors](https://github.com/yt-dlp/yt-dlp/commit/7be14109a6bd493a2e881da4f9e30adaf3e7e5d5) ([#12779](https://github.com/yt-dlp/yt-dlp/issues/12779)) by [bashonly](https://github.com/bashonly), [InvalidUsernameException](https://github.com/InvalidUsernameException)
#### Downloader changes
- **niconicodmc**: [Remove downloader](https://github.com/yt-dlp/yt-dlp/commit/8d127b18f81131453eaba05d3bb810d9b73adb75) ([#12916](https://github.com/yt-dlp/yt-dlp/issues/12916)) by [doe1080](https://github.com/doe1080)
#### Networking changes
- [Add PATCH request shortcut](https://github.com/yt-dlp/yt-dlp/commit/ceab4d5ed63a1f135a1816fe967c9d9a1ec7e6e8) ([#12884](https://github.com/yt-dlp/yt-dlp/issues/12884)) by [doe1080](https://github.com/doe1080)
#### Misc. changes
- **ci**: [Add file mode test to code check](https://github.com/yt-dlp/yt-dlp/commit/3690e91265d1d0bbeffaf6a9b8cc9baded1367bd) ([#13036](https://github.com/yt-dlp/yt-dlp/issues/13036)) by [Grub4K](https://github.com/Grub4K)
- **cleanup**: Miscellaneous: [505b400](https://github.com/yt-dlp/yt-dlp/commit/505b400795af557bdcfd9d4fa7e9133b26ef431c) by [bashonly](https://github.com/bashonly), [seproDev](https://github.com/seproDev)
### 2025.03.31
#### Core changes
- [Add `--compat-options 2024`](https://github.com/yt-dlp/yt-dlp/commit/22e34adbd741e1c7072015debd615dc3fb71c401) ([#12789](https://github.com/yt-dlp/yt-dlp/issues/12789)) by [seproDev](https://github.com/seproDev)
#### Extractor changes
- **francaisfacile**: [Add extractor](https://github.com/yt-dlp/yt-dlp/commit/bb321cfdc3fd4400598ddb12a15862bc2ac8fc10) ([#12787](https://github.com/yt-dlp/yt-dlp/issues/12787)) by [mlabeeb03](https://github.com/mlabeeb03)
- **generic**: [Validate response before checking m3u8 live status](https://github.com/yt-dlp/yt-dlp/commit/9a1ec1d36e172d252714cef712a6d091e0a0c4f2) ([#12784](https://github.com/yt-dlp/yt-dlp/issues/12784)) by [bashonly](https://github.com/bashonly)
- **microsoftlearnepisode**: [Extract more formats](https://github.com/yt-dlp/yt-dlp/commit/d63696f23a341ee36a3237ccb5d5e14b34c2c579) ([#12799](https://github.com/yt-dlp/yt-dlp/issues/12799)) by [bashonly](https://github.com/bashonly)
- **mlbtv**: [Fix radio-only extraction](https://github.com/yt-dlp/yt-dlp/commit/f033d86b96b36f8c5289dd7c3304f42d4d9f6ff4) ([#12792](https://github.com/yt-dlp/yt-dlp/issues/12792)) by [bashonly](https://github.com/bashonly)
- **on24**: [Support `mainEvent` URLs](https://github.com/yt-dlp/yt-dlp/commit/e465b078ead75472fcb7b86f6ccaf2b5d3bc4c21) ([#12800](https://github.com/yt-dlp/yt-dlp/issues/12800)) by [bashonly](https://github.com/bashonly)
- **sbs**: [Fix subtitles extraction](https://github.com/yt-dlp/yt-dlp/commit/29560359120f28adaaac67c86fa8442eb72daa0d) ([#12785](https://github.com/yt-dlp/yt-dlp/issues/12785)) by [bashonly](https://github.com/bashonly)
- **stvr**: [Rename extractor from RTVS to STVR](https://github.com/yt-dlp/yt-dlp/commit/5fc521cbd0ce7b2410d0935369558838728e205d) ([#12788](https://github.com/yt-dlp/yt-dlp/issues/12788)) by [mireq](https://github.com/mireq)
- **twitch**: clips: [Extract portrait formats](https://github.com/yt-dlp/yt-dlp/commit/61046c31612b30c749cbdae934b7fe26abe659d7) ([#12763](https://github.com/yt-dlp/yt-dlp/issues/12763)) by [DmitryScaletta](https://github.com/DmitryScaletta)
- **youtube**
    - [Add `player_js_variant` extractor-arg](https://github.com/yt-dlp/yt-dlp/commit/07f04005e40ebdb368920c511e36e98af0077ed3) ([#12767](https://github.com/yt-dlp/yt-dlp/issues/12767)) by [bashonly](https://github.com/bashonly)
    - tab: [Fix playlist continuation extraction](https://github.com/yt-dlp/yt-dlp/commit/6a6d97b2cbc78f818de05cc96edcdcfd52caa259) ([#12777](https://github.com/yt-dlp/yt-dlp/issues/12777)) by [coletdjnz](https://github.com/coletdjnz)
#### Misc. changes
- **cleanup**: Miscellaneous: [5e457af](https://github.com/yt-dlp/yt-dlp/commit/5e457af57fae9645b1b8fa0ed689229c8fb9656b) by [bashonly](https://github.com/bashonly)
### 2025.03.27
#### Core changes
- **jsinterp**: [Fix nested attributes and object extraction](https://github.com/yt-dlp/yt-dlp/commit/a8b9ff3c2a0ae25735e580173becc78545b92572) ([#12760](https://github.com/yt-dlp/yt-dlp/issues/12760)) by [bashonly](https://github.com/bashonly), [seproDev](https://github.com/seproDev)
#### Extractor changes
- **youtube**: [Make signature and nsig extraction more robust](https://github.com/yt-dlp/yt-dlp/commit/48be862b32648bff5b3e553e40fca4dcc6e88b28) ([#12761](https://github.com/yt-dlp/yt-dlp/issues/12761)) by [bashonly](https://github.com/bashonly), [seproDev](https://github.com/seproDev)
### 2025.03.26
#### Extractor changes
- **youtube**
    - [Fix signature and nsig extraction for player `4fcd6e4a`](https://github.com/yt-dlp/yt-dlp/commit/a550dfc904a02843a26369ae50dbb7c0febfb30e) ([#12748](https://github.com/yt-dlp/yt-dlp/issues/12748)) by [seproDev](https://github.com/seproDev)
    - [Only cache nsig code on successful decoding](https://github.com/yt-dlp/yt-dlp/commit/ecee97b4fa90d51c48f9154c3a6d5a8ffe46cd5c) ([#12750](https://github.com/yt-dlp/yt-dlp/issues/12750)) by [bashonly](https://github.com/bashonly), [seproDev](https://github.com/seproDev)
### 2025.03.25
#### Core changes
- [Fix attribute error on failed VT init](https://github.com/yt-dlp/yt-dlp/commit/b872ffec50fd50f790a5a490e006a369a28a3df3) ([#12696](https://github.com/yt-dlp/yt-dlp/issues/12696)) by [Grub4K](https://github.com/Grub4K)
- **utils**: `js_to_json`: [Make function less fatal](https://github.com/yt-dlp/yt-dlp/commit/9491b44032b330e05bd5eaa546187005d1e8538e) ([#12715](https://github.com/yt-dlp/yt-dlp/issues/12715)) by [seproDev](https://github.com/seproDev)
#### Extractor changes
- [Fix sorting of HLS audio formats by `GROUP-ID`](https://github.com/yt-dlp/yt-dlp/commit/86ab79e1a5182092321102adf6ca34195803b878) ([#12714](https://github.com/yt-dlp/yt-dlp/issues/12714)) by [bashonly](https://github.com/bashonly)
- **17live**: vod: [Add extractor](https://github.com/yt-dlp/yt-dlp/commit/3396eb50dcd245b49c0f4aecd6e80ec914095d16) ([#12723](https://github.com/yt-dlp/yt-dlp/issues/12723)) by [subrat-lima](https://github.com/subrat-lima)
- **9now.com.au**: [Fix extractor](https://github.com/yt-dlp/yt-dlp/commit/9d5e6de2e7a47226d1f72c713ad45c88ba01db68) ([#12702](https://github.com/yt-dlp/yt-dlp/issues/12702)) by [bashonly](https://github.com/bashonly)
- **chzzk**: video: [Fix extraction](https://github.com/yt-dlp/yt-dlp/commit/e2dfccaf808b406d5bcb7dd04ae9ce420752dd6f) ([#12692](https://github.com/yt-dlp/yt-dlp/issues/12692)) by [bashonly](https://github.com/bashonly), [dirkf](https://github.com/dirkf)
- **deezer**: [Remove extractors](https://github.com/yt-dlp/yt-dlp/commit/be5af3f9e91747768c2b41157851bfbe14c663f7) ([#12704](https://github.com/yt-dlp/yt-dlp/issues/12704)) by [seproDev](https://github.com/seproDev)
- **generic**: [Fix MPD base URL parsing](https://github.com/yt-dlp/yt-dlp/commit/5086d4aed6aeb3908c62f49e2d8f74cc0cb05110) ([#12718](https://github.com/yt-dlp/yt-dlp/issues/12718)) by [fireattack](https://github.com/fireattack)
- **streaks**: [Add extractor](https://github.com/yt-dlp/yt-dlp/commit/801afeac91f97dc0b58cd39cc7e8c50f619dc4e1) ([#12679](https://github.com/yt-dlp/yt-dlp/issues/12679)) by [doe1080](https://github.com/doe1080)
- **tver**: [Fix extractor](https://github.com/yt-dlp/yt-dlp/commit/66e0bab814e4a52ef3e12d81123ad992a29df50e) ([#12659](https://github.com/yt-dlp/yt-dlp/issues/12659)) by [arabcoders](https://github.com/arabcoders), [bashonly](https://github.com/bashonly)
- **viki**: [Remove extractors](https://github.com/yt-dlp/yt-dlp/commit/fe4f14b8369038e7c58f7de546d76de1ce3a91ce) ([#12703](https://github.com/yt-dlp/yt-dlp/issues/12703)) by [seproDev](https://github.com/seproDev)
- **vrsquare**: [Add extractors](https://github.com/yt-dlp/yt-dlp/commit/b7fbb5a0a16a8e8d3e29c29e26ebed677d0d6ea3) ([#12515](https://github.com/yt-dlp/yt-dlp/issues/12515)) by [doe1080](https://github.com/doe1080)
- **youtube**
    - [Fix PhantomJS nsig fallback](https://github.com/yt-dlp/yt-dlp/commit/4054a2b623bd1e277b49d2e9abc3d112a4b1c7be) ([#12728](https://github.com/yt-dlp/yt-dlp/issues/12728)) by [bashonly](https://github.com/bashonly)
    - [Fix signature and nsig extraction for player `363db69b`](https://github.com/yt-dlp/yt-dlp/commit/b9c979461b244713bf42691a5bc02834e2ba4b2c) ([#12725](https://github.com/yt-dlp/yt-dlp/issues/12725)) by [bashonly](https://github.com/bashonly)
#### Networking changes
- **Request Handler**: curl_cffi: [Support `curl_cffi` 0.10.x](https://github.com/yt-dlp/yt-dlp/commit/9bf23902ceb948b9685ce1dab575491571720fc6) ([#12670](https://github.com/yt-dlp/yt-dlp/issues/12670)) by [Grub4K](https://github.com/Grub4K)
#### Misc. changes
- **cleanup**: Miscellaneous: [9dde546](https://github.com/yt-dlp/yt-dlp/commit/9dde546e7ee3e1515d88ee3af08b099351455dc0) by [seproDev](https://github.com/seproDev)
### 2025.03.21
#### Core changes
- [Fix external downloader availability when using `--ffmpeg-location`](https://github.com/yt-dlp/yt-dlp/commit/9f77e04c76e36e1cbbf49bc9eb385fa6ef804b67) ([#12318](https://github.com/yt-dlp/yt-dlp/issues/12318)) by [Kenshin9977](https://github.com/Kenshin9977)
- [Load plugins on demand](https://github.com/yt-dlp/yt-dlp/commit/4445f37a7a66b248dbd8376c43137e6e441f138e) ([#11305](https://github.com/yt-dlp/yt-dlp/issues/11305)) by [coletdjnz](https://github.com/coletdjnz), [Grub4K](https://github.com/Grub4K), [pukkandan](https://github.com/pukkandan) (With fixes in [c034d65](https://github.com/yt-dlp/yt-dlp/commit/c034d655487be668222ef9476a16f374584e49a7))
- [Support emitting ConEmu progress codes](https://github.com/yt-dlp/yt-dlp/commit/f7a1f2d8132967a62b0f6d5665c6d2dde2d42c09) ([#10649](https://github.com/yt-dlp/yt-dlp/issues/10649)) by [Grub4K](https://github.com/Grub4K)
#### Extractor changes
- **azmedien**: [Fix extractor](https://github.com/yt-dlp/yt-dlp/commit/26a502fc727d0e91b2db6bf4a112823bcc672e85) ([#12375](https://github.com/yt-dlp/yt-dlp/issues/12375)) by [goggle](https://github.com/goggle)
- **bilibiliplaylist**: [Fix extractor](https://github.com/yt-dlp/yt-dlp/commit/f5fb2229e66cf59d5bf16065bc041b42a28354a0) ([#12690](https://github.com/yt-dlp/yt-dlp/issues/12690)) by [bashonly](https://github.com/bashonly)
- **bunnycdn**: [Add extractor](https://github.com/yt-dlp/yt-dlp/commit/3a1583ca75fb523cbad0e5e174387ea7b477d175) ([#11586](https://github.com/yt-dlp/yt-dlp/issues/11586)) by [Grub4K](https://github.com/Grub4K), [seproDev](https://github.com/seproDev)
- **canalsurmas**: [Add extractor](https://github.com/yt-dlp/yt-dlp/commit/01a8be4c23f186329d85f9c78db34a55f3294ac5) ([#12497](https://github.com/yt-dlp/yt-dlp/issues/12497)) by [Arc8ne](https://github.com/Arc8ne)
- **cda**: [Fix login support](https://github.com/yt-dlp/yt-dlp/commit/be0d819e1103195043f6743650781f0d4d343f6d) ([#12552](https://github.com/yt-dlp/yt-dlp/issues/12552)) by [rysson](https://github.com/rysson)
- **cultureunplugged**: [Extend `_VALID_URL`](https://github.com/yt-dlp/yt-dlp/commit/3042afb5fe342d3a00de76704cd7de611acc350e) ([#12486](https://github.com/yt-dlp/yt-dlp/issues/12486)) by [seproDev](https://github.com/seproDev)
- **dailymotion**: [Improve embed detection](https://github.com/yt-dlp/yt-dlp/commit/ad60137c141efa5023fbc0ac8579eaefe8b3d8cc) ([#12464](https://github.com/yt-dlp/yt-dlp/issues/12464)) by [seproDev](https://github.com/seproDev)
- **gem.cbc.ca**: [Fix login support](https://github.com/yt-dlp/yt-dlp/commit/eb1417786a3027b1e7290ec37ef6aaece50ebed0) ([#12414](https://github.com/yt-dlp/yt-dlp/issues/12414)) by [bashonly](https://github.com/bashonly)
- **globo**: [Fix subtitles extraction](https://github.com/yt-dlp/yt-dlp/commit/0e1697232fcbba7551f983fd1ba93bb445cbb08b) ([#12270](https://github.com/yt-dlp/yt-dlp/issues/12270)) by [pedro](https://github.com/pedro)
- **instagram**
- [Add `app_id` extractor-arg](https://github.com/yt-dlp/yt-dlp/commit/a90641c8363fa0c10800b36eb6b01ee22d3a9409) ([#12359](https://github.com/yt-dlp/yt-dlp/issues/12359)) by [chrisellsworth](https://github.com/chrisellsworth)
- [Fix extraction of older private posts](https://github.com/yt-dlp/yt-dlp/commit/a59abe0636dc49b22a67246afe35613571b86f05) ([#12451](https://github.com/yt-dlp/yt-dlp/issues/12451)) by [bashonly](https://github.com/bashonly)
- [Improve error handling](https://github.com/yt-dlp/yt-dlp/commit/480125560a3b9972d29ae0da850aba8109e6bd41) ([#12410](https://github.com/yt-dlp/yt-dlp/issues/12410)) by [bashonly](https://github.com/bashonly)
- story: [Support `--no-playlist`](https://github.com/yt-dlp/yt-dlp/commit/65c3c58c0a67463a150920203cec929045c95a24) ([#12397](https://github.com/yt-dlp/yt-dlp/issues/12397)) by [fireattack](https://github.com/fireattack)
- **jamendo**: [Fix thumbnail extraction](https://github.com/yt-dlp/yt-dlp/commit/89a68c4857ddbaf937ff22f12648baaf6b5af840) ([#12622](https://github.com/yt-dlp/yt-dlp/issues/12622)) by [bashonly](https://github.com/bashonly), [JChris246](https://github.com/JChris246)
- **ketnet**: [Remove extractor](https://github.com/yt-dlp/yt-dlp/commit/bbada3ec0779422cde34f1ce3dcf595da463b493) ([#12628](https://github.com/yt-dlp/yt-dlp/issues/12628)) by [MichaelDeBoey](https://github.com/MichaelDeBoey)
- **lbry**
- [Make m3u8 format extraction non-fatal](https://github.com/yt-dlp/yt-dlp/commit/9807181cfbf87bfa732f415c30412bdbd77cbf81) ([#12463](https://github.com/yt-dlp/yt-dlp/issues/12463)) by [bashonly](https://github.com/bashonly)
- [Raise appropriate error for non-media files](https://github.com/yt-dlp/yt-dlp/commit/7126b472601814b7fd8c9de02069e8fff1764891) ([#12462](https://github.com/yt-dlp/yt-dlp/issues/12462)) by [bashonly](https://github.com/bashonly)
- **loco**: [Add extractor](https://github.com/yt-dlp/yt-dlp/commit/983095485c731240aae27c950cb8c24a50827b56) ([#12667](https://github.com/yt-dlp/yt-dlp/issues/12667)) by [DTrombett](https://github.com/DTrombett)
- **magellantv**: [Fix extractor](https://github.com/yt-dlp/yt-dlp/commit/172d5fcd778bf2605db7647ebc56b29ed18d24ac) ([#12505](https://github.com/yt-dlp/yt-dlp/issues/12505)) by [seproDev](https://github.com/seproDev)
- **mitele**: [Fix extractor](https://github.com/yt-dlp/yt-dlp/commit/7223d29569a48a35ad132a508c115973866838d3) ([#12689](https://github.com/yt-dlp/yt-dlp/issues/12689)) by [bashonly](https://github.com/bashonly)
- **msn**: [Rework extractor](https://github.com/yt-dlp/yt-dlp/commit/4815dac131d42c51e12c1d05232db0bbbf607329) ([#12513](https://github.com/yt-dlp/yt-dlp/issues/12513)) by [seproDev](https://github.com/seproDev), [thedenv](https://github.com/thedenv)
- **n1**: [Fix extraction of newer articles](https://github.com/yt-dlp/yt-dlp/commit/9d70abe4de401175cbbaaa36017806f16b2df9af) ([#12514](https://github.com/yt-dlp/yt-dlp/issues/12514)) by [u-spec-png](https://github.com/u-spec-png)
- **nbcstations**: [Fix extractor](https://github.com/yt-dlp/yt-dlp/commit/ebac65aa9e0bf9a97c24d00f7977900d2577364b) ([#12534](https://github.com/yt-dlp/yt-dlp/issues/12534)) by [refack](https://github.com/refack)
- **niconico**
- [Fix format sorting](https://github.com/yt-dlp/yt-dlp/commit/7508e34f203e97389f1d04db92140b13401dd724) ([#12442](https://github.com/yt-dlp/yt-dlp/issues/12442)) by [xpadev-net](https://github.com/xpadev-net)
- live: [Fix thumbnail extraction](https://github.com/yt-dlp/yt-dlp/commit/c2e6e1d5f77f3b720a6266f2869eb750d20e5dc1) ([#12419](https://github.com/yt-dlp/yt-dlp/issues/12419)) by [bashonly](https://github.com/bashonly)
- **openrec**: [Fix `_VALID_URL`](https://github.com/yt-dlp/yt-dlp/commit/17504f253564cfad86244de2b6346d07d2300ca5) ([#12608](https://github.com/yt-dlp/yt-dlp/issues/12608)) by [fireattack](https://github.com/fireattack)
- **pinterest**: [Fix extractor](https://github.com/yt-dlp/yt-dlp/commit/bd0a66816934de70312eea1e71c59c13b401dc3a) ([#12538](https://github.com/yt-dlp/yt-dlp/issues/12538)) by [mikf](https://github.com/mikf)
- **playsuisse**: [Fix login support](https://github.com/yt-dlp/yt-dlp/commit/6933f5670cea9c3e2fb16c1caa1eda54d13122c5) ([#12444](https://github.com/yt-dlp/yt-dlp/issues/12444)) by [bashonly](https://github.com/bashonly)
- **reddit**: [Truncate title](https://github.com/yt-dlp/yt-dlp/commit/d9a53cc1e6fd912daf500ca4f19e9ca88994dbf9) ([#12567](https://github.com/yt-dlp/yt-dlp/issues/12567)) by [seproDev](https://github.com/seproDev)
- **rtp**: [Rework extractor](https://github.com/yt-dlp/yt-dlp/commit/8eb9c1bf3b9908cca22ef043602aa24fb9f352c6) ([#11638](https://github.com/yt-dlp/yt-dlp/issues/11638)) by [pferreir](https://github.com/pferreir), [red-acid](https://github.com/red-acid), [seproDev](https://github.com/seproDev), [somini](https://github.com/somini), [vallovic](https://github.com/vallovic)
- **softwhiteunderbelly**: [Add extractor](https://github.com/yt-dlp/yt-dlp/commit/652827d5a076c9483c36654ad2cf3fe46219baf4) ([#12281](https://github.com/yt-dlp/yt-dlp/issues/12281)) by [benfaerber](https://github.com/benfaerber)
- **soop**: [Fix timestamp extraction](https://github.com/yt-dlp/yt-dlp/commit/8305df00012ff8138a6ff95279d06b54ac607f63) ([#12609](https://github.com/yt-dlp/yt-dlp/issues/12609)) by [msikma](https://github.com/msikma)
- **soundcloud**
- [Extract tags](https://github.com/yt-dlp/yt-dlp/commit/9deed13d7cce6d3647379e50589c92de89227509) ([#12420](https://github.com/yt-dlp/yt-dlp/issues/12420)) by [bashonly](https://github.com/bashonly)
- [Fix thumbnail extraction](https://github.com/yt-dlp/yt-dlp/commit/6deeda5c11f34f613724fa0627879f0d607ba1b4) ([#12447](https://github.com/yt-dlp/yt-dlp/issues/12447)) by [bashonly](https://github.com/bashonly)
- **tiktok**
- [Improve error handling](https://github.com/yt-dlp/yt-dlp/commit/99ea2978757a431eeb2a265b3395ccbe4ce202cf) ([#12445](https://github.com/yt-dlp/yt-dlp/issues/12445)) by [bashonly](https://github.com/bashonly)
- [Truncate title](https://github.com/yt-dlp/yt-dlp/commit/83b119dadb0f267f1fb66bf7ed74c097349de79e) ([#12566](https://github.com/yt-dlp/yt-dlp/issues/12566)) by [seproDev](https://github.com/seproDev)
- **tv8.it**: [Add live and playlist extractors](https://github.com/yt-dlp/yt-dlp/commit/2ee3a0aff9be2be3bea60640d3d8a0febaf0acb6) ([#12569](https://github.com/yt-dlp/yt-dlp/issues/12569)) by [DTrombett](https://github.com/DTrombett)
- **tvw**: [Add extractor](https://github.com/yt-dlp/yt-dlp/commit/42b7440963866e31ff84a5b89030d1c596fa2e6e) ([#12271](https://github.com/yt-dlp/yt-dlp/issues/12271)) by [fries1234](https://github.com/fries1234)
- **twitter**
- [Fix syndication token generation](https://github.com/yt-dlp/yt-dlp/commit/b8b47547049f5ebc3dd680fc7de70ed0ca9c0d70) ([#12537](https://github.com/yt-dlp/yt-dlp/issues/12537)) by [bashonly](https://github.com/bashonly)
- [Truncate title](https://github.com/yt-dlp/yt-dlp/commit/06f6de78db2eceeabd062ab1a3023e0ff9d4df53) ([#12560](https://github.com/yt-dlp/yt-dlp/issues/12560)) by [seproDev](https://github.com/seproDev)
- **vk**: [Improve metadata extraction](https://github.com/yt-dlp/yt-dlp/commit/05c8023a27dd37c49163c0498bf98e3e3c1cb4b9) ([#12510](https://github.com/yt-dlp/yt-dlp/issues/12510)) by [seproDev](https://github.com/seproDev)
- **vrtmax**: [Rework extractor](https://github.com/yt-dlp/yt-dlp/commit/df9ebeec00d658693252978d1ffb885e67aa6ab6) ([#12479](https://github.com/yt-dlp/yt-dlp/issues/12479)) by [bergoid](https://github.com/bergoid), [MichaelDeBoey](https://github.com/MichaelDeBoey), [seproDev](https://github.com/seproDev)
- **weibo**: [Support playlists](https://github.com/yt-dlp/yt-dlp/commit/0bb39788626002a8a67e925580227952c563c8b9) ([#12284](https://github.com/yt-dlp/yt-dlp/issues/12284)) by [4ft35t](https://github.com/4ft35t)
- **wsj**: [Support opinion URLs and impersonation](https://github.com/yt-dlp/yt-dlp/commit/7f3006eb0c0659982bb956d71b0bc806bcb0a5f2) ([#12431](https://github.com/yt-dlp/yt-dlp/issues/12431)) by [refack](https://github.com/refack)
- **youtube**
- [Fix nsig and signature extraction for player `643afba4`](https://github.com/yt-dlp/yt-dlp/commit/9b868518a15599f3d7ef5a1c730dda164c30da9b) ([#12684](https://github.com/yt-dlp/yt-dlp/issues/12684)) by [bashonly](https://github.com/bashonly), [seproDev](https://github.com/seproDev)
- [Player client maintenance](https://github.com/yt-dlp/yt-dlp/commit/3380febe9984c21c79c3147c1d390a4cf339bc4c) ([#12603](https://github.com/yt-dlp/yt-dlp/issues/12603)) by [seproDev](https://github.com/seproDev)
- [Split into package](https://github.com/yt-dlp/yt-dlp/commit/4432a9390c79253ac830702b226d2e558b636725) ([#12557](https://github.com/yt-dlp/yt-dlp/issues/12557)) by [coletdjnz](https://github.com/coletdjnz)
- [Warn on DRM formats](https://github.com/yt-dlp/yt-dlp/commit/e67d786c7cc87bd449d22e0ddef08306891c1173) ([#12593](https://github.com/yt-dlp/yt-dlp/issues/12593)) by [coletdjnz](https://github.com/coletdjnz)
- [Warn on missing formats due to SSAP](https://github.com/yt-dlp/yt-dlp/commit/79ec2fdff75c8c1bb89b550266849ad4dec48dd3) ([#12483](https://github.com/yt-dlp/yt-dlp/issues/12483)) by [coletdjnz](https://github.com/coletdjnz)
#### Networking changes
- [Add `keep_header_casing` extension](https://github.com/yt-dlp/yt-dlp/commit/7d18fed8f1983fe6de4ddc810dfb2761ba5744ac) ([#11652](https://github.com/yt-dlp/yt-dlp/issues/11652)) by [coletdjnz](https://github.com/coletdjnz), [Grub4K](https://github.com/Grub4K)
- [Always add unsupported suffix on version mismatch](https://github.com/yt-dlp/yt-dlp/commit/95f8df2f796d0048119615200758199aedcd7cf4) ([#12626](https://github.com/yt-dlp/yt-dlp/issues/12626)) by [Grub4K](https://github.com/Grub4K)
#### Misc. changes
- **cleanup**: Miscellaneous: [f36e4b6](https://github.com/yt-dlp/yt-dlp/commit/f36e4b6e65cb8403791aae2f520697115cb88dec) by [dirkf](https://github.com/dirkf), [gamer191](https://github.com/gamer191), [Grub4K](https://github.com/Grub4K), [seproDev](https://github.com/seproDev)
- **test**: [Show all differences for `expect_value` and `expect_dict`](https://github.com/yt-dlp/yt-dlp/commit/a3e0c7d3b267abdf3933b709704a28d43bb46503) ([#12334](https://github.com/yt-dlp/yt-dlp/issues/12334)) by [Grub4K](https://github.com/Grub4K)
### 2025.02.19
#### Core changes

@ -44,6 +44,7 @@
* [Post-processing Options](#post-processing-options)
* [SponsorBlock Options](#sponsorblock-options)
* [Extractor Options](#extractor-options)
* [Preset Aliases](#preset-aliases)
* [CONFIGURATION](#configuration)
* [Configuration file encoding](#configuration-file-encoding)
* [Authentication with netrc](#authentication-with-netrc)
@ -348,8 +349,8 @@ ## General Options:
--no-flat-playlist Fully extract the videos of a playlist
(default)
--live-from-start Download livestreams from the start.
Currently only supported for YouTube
(Experimental)
Currently experimental and only supported
for YouTube and Twitch
--no-live-from-start Download livestreams from the current time
(default)
--wait-for-video MIN[-MAX] Wait for scheduled streams to become
@ -375,17 +376,23 @@ ## General Options:
an alias starts with a dash "-", it is
prefixed with "--". Arguments are parsed
according to the Python string formatting
mini-language. E.g. --alias get-audio,-X
"-S=aext:{0},abr -x --audio-format {0}"
creates options "--get-audio" and "-X" that
takes an argument (ARG0) and expands to
"-S=aext:ARG0,abr -x --audio-format ARG0".
All defined aliases are listed in the --help
mini-language. E.g. --alias get-audio,-X "-S
aext:{0},abr -x --audio-format {0}" creates
options "--get-audio" and "-X" that takes an
argument (ARG0) and expands to "-S
aext:ARG0,abr -x --audio-format ARG0". All
defined aliases are listed in the --help
output. Alias options can trigger more
aliases; so be careful to avoid defining
recursive options. As a safety measure, each
alias may be triggered a maximum of 100
times. This option can be used multiple times
-t, --preset-alias PRESET Applies a predefined set of options. e.g.
--preset-alias mp3. The following presets
are available: mp3, aac, mp4, mkv, sleep.
See the "Preset Aliases" section at the end
for more info. This option can be used
multiple times
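The `--alias` description above says that arguments are substituted using the Python string formatting mini-language. A minimal sketch of that expansion follows. The helper name `expand_alias` is hypothetical, and splitting the result with `shlex` is an assumption made for illustration, not a statement about how yt-dlp implements the option.

```python
import shlex


def expand_alias(template: str, *args: str) -> list[str]:
    """Fill an alias template with str.format, then split it into argv tokens.

    Hypothetical helper for illustration only.
    """
    return shlex.split(template.format(*args))


# The template from the help text; {0} is replaced by the first alias argument.
template = "-S aext:{0},abr -x --audio-format {0}"
print(expand_alias(template, "mp3"))
# ['-S', 'aext:mp3,abr', '-x', '--audio-format', 'mp3']
```

With this template, `--get-audio mp3` (or `-X mp3`) would stand for `-S aext:mp3,abr -x --audio-format mp3`, as the help text states.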
## Network Options:
--proxy URL Use the specified HTTP/HTTPS/SOCKS proxy. To
@ -1098,6 +1105,27 @@ ## Extractor Options:
can use this option multiple times to give
arguments for different extractors
## Preset Aliases:
Predefined aliases for convenience and ease of use. Note that future
versions of yt-dlp may add or adjust presets, but the existing preset
names will not be changed or removed
-t mp3 -f 'ba[acodec^=mp3]/ba/b' -x --audio-format
mp3
-t aac -f
'ba[acodec^=aac]/ba[acodec^=mp4a.40.]/ba/b'
-x --audio-format aac
-t mp4 --merge-output-format mp4 --remux-video mp4
-S vcodec:h264,lang,quality,res,fps,hdr:12,a
codec:aac
-t mkv --merge-output-format mkv --remux-video mkv
-t sleep --sleep-subtitles 5 --sleep-requests 0.75
--sleep-interval 10 --max-sleep-interval 20
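The preset table above can be read as a simple name-to-options mapping. The sketch below copies the option lists from that table into a dictionary and replaces each `-t NAME` pair in an argument list. `PRESETS` and `expand_presets` are hypothetical names, and the sketch only illustrates the mapping; it does not claim to mirror yt-dlp's option parser.

```python
# Option lists copied from the "Preset Aliases" table above.
PRESETS = {
    "mp3": ["-f", "ba[acodec^=mp3]/ba/b", "-x", "--audio-format", "mp3"],
    "aac": ["-f", "ba[acodec^=aac]/ba[acodec^=mp4a.40.]/ba/b",
            "-x", "--audio-format", "aac"],
    "mp4": ["--merge-output-format", "mp4", "--remux-video", "mp4",
            "-S", "vcodec:h264,lang,quality,res,fps,hdr:12,acodec:aac"],
    "mkv": ["--merge-output-format", "mkv", "--remux-video", "mkv"],
    "sleep": ["--sleep-subtitles", "5", "--sleep-requests", "0.75",
              "--sleep-interval", "10", "--max-sleep-interval", "20"],
}


def expand_presets(argv: list[str]) -> list[str]:
    """Replace every '-t NAME' / '--preset-alias NAME' pair with its options."""
    expanded = []
    args = iter(argv)
    for arg in args:
        if arg in ("-t", "--preset-alias"):
            name = next(args, None)
            if name not in PRESETS:
                raise ValueError(f"unknown or missing preset: {name!r}")
            expanded.extend(PRESETS[name])
        else:
            expanded.append(arg)
    return expanded


print(expand_presets(["-t", "mkv", "URL"]))
# ['--merge-output-format', 'mkv', '--remux-video', 'mkv', 'URL']
```

Because the option can be used multiple times, something like `-t mp4 -t sleep` would simply concatenate both option lists in this model.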
# CONFIGURATION
You can configure yt-dlp by placing any supported command line option in a configuration file. The configuration is loaded from the following locations:
@ -1769,9 +1797,10 @@ # EXTRACTOR ARGUMENTS
#### youtube
* `lang`: Prefer translated metadata (`title`, `description` etc) of this language code (case-sensitive). By default, the video primary language metadata is preferred, with a fallback to `en` translated. See [youtube.py](https://github.com/yt-dlp/yt-dlp/blob/c26f9b991a0681fd3ea548d535919cec1fbbd430/yt_dlp/extractor/youtube.py#L381-L390) for list of supported content language codes
* `skip`: One or more of `hls`, `dash` or `translated_subs` to skip extraction of the m3u8 manifests, dash manifests and [auto-translated subtitles](https://github.com/yt-dlp/yt-dlp/issues/4090#issuecomment-1158102032) respectively
* `player_client`: Clients to extract video data from. The currently available clients are `web`, `web_safari`, `web_embedded`, `web_music`, `web_creator`, `mweb`, `ios`, `android`, `android_vr`, `tv` and `tv_embedded`. By default, `tv,ios,web` is used, or `tv,web` is used when authenticating with cookies. The `web_music` client is added for `music.youtube.com` URLs when logged-in cookies are used. The `tv_embedded` and `web_creator` clients are added for age-restricted videos if account age-verification is required. Some clients, such as `web` and `web_music`, require a `po_token` for their formats to be downloadable. Some clients, such as `web_creator`, will only work with authentication. Not all clients support authentication via cookies. You can use `default` for the default clients, or you can use `all` for all clients (not recommended). You can prefix a client with `-` to exclude it, e.g. `youtube:player_client=default,-ios`
* `player_skip`: Skip some network requests that are generally needed for robust extraction. One or more of `configs` (skip client configs), `webpage` (skip initial webpage), `js` (skip js player). While these options can help reduce the number of requests needed or avoid some rate-limiting, they could cause some issues. See [#860](https://github.com/yt-dlp/yt-dlp/pull/860) for more details
* `player_client`: Clients to extract video data from. The currently available clients are `web`, `web_safari`, `web_embedded`, `web_music`, `web_creator`, `mweb`, `ios`, `android`, `android_vr`, `tv` and `tv_embedded`. By default, `tv,ios,web` is used, or `tv,web` is used when authenticating with cookies. The `web_music` client is added for `music.youtube.com` URLs when logged-in cookies are used. The `web_embedded` client is added for age-restricted videos but only works if the video is embeddable. The `tv_embedded` and `web_creator` clients are added for age-restricted videos if account age-verification is required. Some clients, such as `web` and `web_music`, require a `po_token` for their formats to be downloadable. Some clients, such as `web_creator`, will only work with authentication. Not all clients support authentication via cookies. You can use `default` for the default clients, or you can use `all` for all clients (not recommended). You can prefix a client with `-` to exclude it, e.g. `youtube:player_client=default,-ios`
* `player_skip`: Skip some network requests that are generally needed for robust extraction. One or more of `configs` (skip client configs), `webpage` (skip initial webpage), `js` (skip js player), `initial_data` (skip initial data/next ep request). While these options can help reduce the number of requests needed or avoid some rate-limiting, they could cause issues such as missing formats or metadata. See [#860](https://github.com/yt-dlp/yt-dlp/pull/860) and [#12826](https://github.com/yt-dlp/yt-dlp/issues/12826) for more details
* `player_params`: YouTube player parameters to use for player requests. Will overwrite any default ones set by yt-dlp.
* `player_js_variant`: The player javascript variant to use for signature and nsig deciphering. The known variants are: `main`, `tce`, `tv`, `tv_es6`, `phone`, `tablet`. Only `main` is recommended as a possible workaround; the others are for debugging purposes. The default is to use what is prescribed by the site, and can be selected with `actual`
* `comment_sort`: `top` or `new` (default) - choose comment sorting mode (on YouTube's side)
* `max_comments`: Limit the amount of comments to gather. Comma-separated list of integers representing `max-comments,max-parents,max-replies,max-replies-per-thread`. Default is `all,all,all,all`
* E.g. `all,all,1000,10` will get a maximum of 1000 replies total, with up to 10 replies per thread. `1000,all,100` will get a maximum of 1000 comments, with a maximum of 100 replies total
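As a rough illustration of the `max_comments` format described above, the sketch below turns the comma-separated value into four named limits. The function name `parse_max_comments` is hypothetical, `None` is used here to stand for `all`, and treating omitted trailing fields as `all` is an assumption inferred from the stated default and the three-value example.

```python
FIELDS = ("max-comments", "max-parents", "max-replies", "max-replies-per-thread")


def parse_max_comments(value: str) -> dict:
    """Parse e.g. 'all,all,1000,10' into named limits (None means no limit)."""
    parts = value.split(",")
    if len(parts) > len(FIELDS):
        raise ValueError(f"expected at most {len(FIELDS)} values, got {len(parts)}")
    # Assumption: fields that are left out fall back to the default, 'all'.
    parts += ["all"] * (len(FIELDS) - len(parts))
    return {name: None if part == "all" else int(part)
            for name, part in zip(FIELDS, parts)}


print(parse_max_comments("1000,all,100"))
# {'max-comments': 1000, 'max-parents': None, 'max-replies': 100, 'max-replies-per-thread': None}
```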
@ -1781,7 +1810,12 @@ #### youtube
* `raise_incomplete_data`: `Incomplete Data Received` raises an error instead of reporting a warning
* `data_sync_id`: Overrides the account Data Sync ID used in Innertube API requests. This may be needed if you are using an account with `youtube:player_skip=webpage,configs` or `youtubetab:skip=webpage`
* `visitor_data`: Overrides the Visitor Data used in Innertube API requests. This should be used with `player_skip=webpage,configs` and without cookies. Note: this may have adverse effects if used improperly. If a session from a browser is wanted, you should pass cookies instead (which contain the Visitor ID)
* `po_token`: Proof of Origin (PO) Token(s) to use. Comma separated list of PO Tokens in the format `CLIENT.CONTEXT+PO_TOKEN`, e.g. `youtube:po_token=web.gvs+XXX,web.player=XXX,web_safari.gvs+YYY`. Context can be either `gvs` (Google Video Server URLs) or `player` (Innertube player request)
* `po_token`: Proof of Origin (PO) Token(s) to use. Comma separated list of PO Tokens in the format `CLIENT.CONTEXT+PO_TOKEN`, e.g. `youtube:po_token=web.gvs+XXX,web.player=XXX,web_safari.gvs+YYY`. Context can be any of `gvs` (Google Video Server URLs), `player` (Innertube player request) or `subs` (Subtitles)
* `pot_trace`: Enable debug logging for PO Token fetching. Either `true` or `false` (default)
* `fetch_pot`: Policy to use for fetching a PO Token from providers. One of `always` (always try to fetch a PO Token, regardless of whether the client requires one for the given context), `never` (never fetch a PO Token), or `auto` (default; only fetch a PO Token if the client requires one for the given context)
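To make the `CLIENT.CONTEXT+PO_TOKEN` format of `po_token` concrete, here is a minimal parsing sketch. The function name `parse_po_tokens` is hypothetical and the token values are placeholders; only the three contexts named above (`gvs`, `player`, `subs`) are accepted. This is an illustration of the documented format, not yt-dlp's own parsing code.

```python
CONTEXTS = {"gvs", "player", "subs"}


def parse_po_tokens(value: str) -> dict:
    """Map (client, context) pairs to tokens from a comma-separated list."""
    tokens = {}
    for entry in value.split(","):
        # Split on the first '+' only, in case the token itself contains '+'.
        target, plus, token = entry.partition("+")
        client, dot, context = target.partition(".")
        if not (plus and dot and client and token) or context not in CONTEXTS:
            raise ValueError(f"expected CLIENT.CONTEXT+PO_TOKEN, got {entry!r}")
        tokens[(client, context)] = token
    return tokens


print(parse_po_tokens("web.gvs+XXX,web_safari.gvs+YYY"))
# {('web', 'gvs'): 'XXX', ('web_safari', 'gvs'): 'YYY'}
```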
#### youtubepot-webpo
* `bind_to_visitor_id`: Whether to use the Visitor ID instead of Visitor Data for caching WebPO tokens. Either `true` (default) or `false`
#### youtubetab (YouTube playlists, channels, feeds, etc.)
* `skip`: One or more of `webpage` (skip initial webpage download), `authcheck` (allow the download of playlists requiring authentication when no initial webpage is downloaded. This may cause unwanted behavior, see [#1122](https://github.com/yt-dlp/yt-dlp/pull/1122) for more details)
@ -1798,9 +1832,6 @@ #### generic
#### vikichannel
* `video_types`: Types of videos to download - one or more of `episodes`, `movies`, `clips`, `trailers`
#### niconico
* `segment_duration`: Segment duration in milliseconds for HLS-DMC formats. Use it at your own risk since this feature **may result in your account termination.**
#### youtubewebarchive
* `check_all`: Try to check more at the cost of more requests. One or more of `thumbnails`, `captures`
@ -1866,6 +1897,9 @@ #### bilibili
#### sonylivseries
* `sort_order`: Episode sort order for series extraction - one of `asc` (ascending, oldest first) or `desc` (descending, newest first). Default is `asc`
#### tver
* `backend`: Backend API to use for extraction - one of `streaks` (default) or `brightcove` (deprecated)
**Note**: These options may be changed/removed in the future without concern for backward compatibility
<!-- MANPAGE: MOVE "INSTALLATION" SECTION HERE -->
@ -2149,7 +2183,7 @@ ### New features
* **[Format Sorting](#sorting-formats)**: The default format sorting options have been changed so that higher resolution and better codecs will be now preferred instead of simply using larger bitrate. Furthermore, you can now specify the sort order using `-S`. This allows for much easier format selection than what is possible by simply using `--format` ([examples](#format-selection-examples))
* **Merged with animelover1984/youtube-dl**: You get most of the features and improvements from [animelover1984/youtube-dl](https://github.com/animelover1984/youtube-dl) including `--write-comments`, `BiliBiliSearch`, `BilibiliChannel`, Embedding thumbnail in mp4/ogg/opus, playlist infojson etc. Note that NicoNico livestreams are not available. See [#31](https://github.com/yt-dlp/yt-dlp/pull/31) for details.
* **Merged with animelover1984/youtube-dl**: You get most of the features and improvements from [animelover1984/youtube-dl](https://github.com/animelover1984/youtube-dl) including `--write-comments`, `BiliBiliSearch`, `BilibiliChannel`, Embedding thumbnail in mp4/ogg/opus, playlist infojson etc. See [#31](https://github.com/yt-dlp/yt-dlp/pull/31) for details.
* **YouTube improvements**:
* Supports Clips, Stories (`ytstories:<channel UCID>`), Search (including filters)**\***, YouTube Music Search, Channel-specific search, Search prefixes (`ytsearch:`, `ytsearchdate:`)**\***, Mixes, and Feeds (`:ytfav`, `:ytwatchlater`, `:ytsubs`, `:ythistory`, `:ytrec`, `:ytnotif`)
@ -2215,7 +2249,7 @@ ### Differences in default behavior
* Live chats (if available) are considered as subtitles. Use `--sub-langs all,-live_chat` to download all subtitles except live chat. You can also use `--compat-options no-live-chat` to prevent any live chat/danmaku from downloading
* YouTube channel URLs download all uploads of the channel. To download only the videos in a specific tab, pass the tab's URL. If the channel does not show the requested tab, an error will be raised. Also, `/live` URLs raise an error if there are no live videos instead of silently downloading the entire channel. You may use `--compat-options no-youtube-channel-redirect` to revert all these redirections
* Unavailable videos are also listed for YouTube playlists. Use `--compat-options no-youtube-unavailable-videos` to remove this
* The upload dates extracted from YouTube are in UTC [when available](https://github.com/yt-dlp/yt-dlp/blob/89e4d86171c7b7c997c77d4714542e0383bf0db0/yt_dlp/extractor/youtube.py#L3898-L3900). Use `--compat-options no-youtube-prefer-utc-upload-date` to prefer the non-UTC upload date.
* The upload dates extracted from YouTube are in UTC.
* If `ffmpeg` is used as the downloader, the downloading and merging of formats happen in a single step when possible. Use `--compat-options no-direct-merge` to revert this
* Thumbnail embedding in `mp4` is done with mutagen if possible. Use `--compat-options embed-thumbnail-atomicparsley` to force the use of AtomicParsley instead
* Some internal metadata such as filenames are removed by default from the infojson. Use `--no-clean-infojson` or `--compat-options no-clean-infojson` to revert this
@ -2234,9 +2268,10 @@ ### Differences in default behavior
* `--compat-options all`: Use all compat options (**Do NOT use this!**)
* `--compat-options youtube-dl`: Same as `--compat-options all,-multistreams,-playlist-match-filter,-manifest-filesize-approx,-allow-unsafe-ext,-prefer-vp9-sort`
* `--compat-options youtube-dlc`: Same as `--compat-options all,-no-live-chat,-no-youtube-channel-redirect,-playlist-match-filter,-manifest-filesize-approx,-allow-unsafe-ext,-prefer-vp9-sort`
* `--compat-options 2021`: Same as `--compat-options 2022,no-certifi,filename-sanitization,no-youtube-prefer-utc-upload-date`
* `--compat-options 2021`: Same as `--compat-options 2022,no-certifi,filename-sanitization`
* `--compat-options 2022`: Same as `--compat-options 2023,playlist-match-filter,no-external-downloader-progress,prefer-legacy-http-handler,manifest-filesize-approx`
* `--compat-options 2023`: Same as `--compat-options prefer-vp9-sort`. Use this to enable all future compat options
* `--compat-options 2023`: Same as `--compat-options 2024,prefer-vp9-sort`
* `--compat-options 2024`: Currently does nothing. Use this to enable all future compat options
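The year-based compat options above chain into each other: `2021` includes `2022`, which includes `2023`, which includes `2024`. The sketch below resolves that chain recursively, using the updated definitions from this list. `COMPAT_ALIASES` and `resolve_compat` are hypothetical names, and the sketch deliberately ignores `all`, `youtube-dl`, `youtube-dlc` and `-` exclusions.

```python
# Year aliases as listed above (after this change).
COMPAT_ALIASES = {
    "2021": ["2022", "no-certifi", "filename-sanitization"],
    "2022": ["2023", "playlist-match-filter", "no-external-downloader-progress",
             "prefer-legacy-http-handler", "manifest-filesize-approx"],
    "2023": ["2024", "prefer-vp9-sort"],
    "2024": [],  # currently does nothing
}


def resolve_compat(option: str) -> list[str]:
    """Expand a year alias into the individual compat options it enables."""
    if option not in COMPAT_ALIASES:
        return [option]
    resolved = []
    for item in COMPAT_ALIASES[option]:
        resolved.extend(resolve_compat(item))
    return resolved


print(resolve_compat("2023"))
# ['prefer-vp9-sort']
```

Under this model, any compat option later attached to `2024` would automatically be picked up by `2023`, `2022` and `2021` as well, which matches the note that `2024` is the one to use for enabling future compat options.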
The following compat options restore vulnerable behavior from before security patches:

@ -2,6 +2,7 @@
set -e
source ~/.local/share/pipx/venvs/pyinstaller/bin/activate
python -m devscripts.install_deps -o --include build
python -m devscripts.install_deps --include secretstorage --include curl-cffi
python -m devscripts.make_lazy_extractors
python devscripts/update-version.py -c "${channel}" -r "${origin}" "${version}"

@ -36,6 +36,9 @@ def main():
f'--name={name}',
'--icon=devscripts/logo.ico',
'--upx-exclude=vcruntime140.dll',
# Ref: https://github.com/yt-dlp/yt-dlp/issues/13311
# https://github.com/pyinstaller/pyinstaller/issues/9149
'--exclude-module=pkg_resources',
'--noconfirm',
'--additional-hooks-dir=yt_dlp/__pyinstaller',
*opts,

@ -245,5 +245,14 @@
"when": "76ac023ff02f06e8c003d104f02a03deeddebdcd",
"short": "[ie/youtube:tab] Improve shorts title extraction (#11997)",
"authors": ["bashonly", "d3d9"]
},
{
"action": "add",
"when": "88eb1e7a9a2720ac89d653c0d0e40292388823bb",
"short": "[priority] **New option `--preset-alias`/`-t` has been added**\nThis provides convenient predefined aliases for common use cases. Available presets include `mp4`, `mp3`, `mkv`, `aac`, and `sleep`. See [the README](https://github.com/yt-dlp/yt-dlp/blob/master/README.md#preset-aliases) for more details."
},
{
"action": "remove",
"when": "d596824c2f8428362c072518856065070616e348"
}
]

@ -55,8 +55,7 @@ default = [
"websockets>=13.0",
]
curl-cffi = [
"curl-cffi==0.5.10; os_name=='nt' and implementation_name=='cpython'",
"curl-cffi>=0.5.10,!=0.6.*,<0.7.2; os_name!='nt' and implementation_name=='cpython'",
"curl-cffi>=0.5.10,!=0.6.*,!=0.7.*,!=0.8.*,!=0.9.*,<0.11; implementation_name=='cpython'",
]
secretstorage = [
"cffi",
@ -66,7 +65,7 @@ build = [
"build",
"hatchling",
"pip",
"setuptools>=71.0.2", # 71.0.0 broke pyinstaller
"setuptools>=71.0.2,<81", # See https://github.com/pyinstaller/pyinstaller/issues/9149
"wheel",
]
dev = [
@ -76,14 +75,14 @@ dev = [
]
static-analysis = [
"autopep8~=2.0",
"ruff~=0.9.0",
"ruff~=0.11.0",
]
test = [
"pytest~=8.1",
"pytest-rerunfailures~=14.0",
]
pyinstaller = [
"pyinstaller>=6.11.1", # Windows temp cleanup fixed in 6.11.1
"pyinstaller>=6.13.0", # Windows temp cleanup fixed in 6.13.0
]
[project.urls]
@ -387,7 +386,11 @@ select = [
exclude = "*/extractor/lazy_extractors.py,*venv*,*/test/testdata/sigs/player-*.js,.idea,.vscode"
[tool.pytest.ini_options]
addopts = "-ra -v --strict-markers"
addopts = [
"-ra", # summary: all except passed
"--verbose",
"--strict-markers",
]
markers = [
"download",
]

@ -7,6 +7,7 @@ # Supported sites
- **17live**
- **17live:clip**
- **17live:vod**
- **1News**: 1news.co.nz article videos
- **1tv**: Первый канал
- **20min**
@ -200,7 +201,7 @@ # Supported sites
- **blogger.com**
- **Bloomberg**
- **Bluesky**
- **BokeCC**
- **BokeCC**: CC视频
- **BongaCams**
- **Boosty**
- **BostonGlobe**
@ -224,6 +225,7 @@ # Supported sites
- **bt:vestlendingen**: Bergens Tidende - Vestlendingen
- **Bundesliga**
- **Bundestag**
- **BunnyCdn**
- **BusinessInsider**
- **BuzzFeed**
- **BYUtv**: (**Currently broken**)
@ -242,8 +244,8 @@ # Supported sites
- **CanalAlpha**
- **canalc2.tv**
- **Canalplus**: mycanal.fr and piwiplus.fr
- **Canalsurmas**
- **CaracolTvPlay**: [*caracoltv-play*](## "netrc machine")
- **CartoonNetwork**
- **cbc.ca**
- **cbc.ca:player**
- **cbc.ca:player:playlist**
@ -345,8 +347,6 @@ # Supported sites
- **daystar:clip**
- **DBTV**
- **DctpTv**
- **DeezerAlbum**
- **DeezerPlaylist**
- **democracynow**
- **DestinationAmerica**
- **DetikEmbed**
@ -393,6 +393,8 @@ # Supported sites
- **dvtv**: http://video.aktualne.cz/
- **dw**: (**Currently broken**)
- **dw:article**: (**Currently broken**)
- **dzen.ru**: Дзен (dzen) formerly Яндекс.Дзен (Yandex Zen)
- **dzen.ru:channel**
- **EaglePlatform**
- **EbaumsWorld**
- **Ebay**
@ -471,6 +473,7 @@ # Supported sites
- **FoxNewsVideo**
- **FoxSports**
- **fptplay**: fptplay.vn
- **FrancaisFacile**
- **FranceCulture**
- **FranceInter**
- **francetv**
@ -609,10 +612,10 @@ # Supported sites
- **Inc**
- **IndavideoEmbed**
- **InfoQ**
- **Instagram**: [*instagram*](## "netrc machine")
- **instagram:story**: [*instagram*](## "netrc machine")
- **instagram:tag**: [*instagram*](## "netrc machine") Instagram hashtag search URLs
- **instagram:user**: [*instagram*](## "netrc machine") Instagram user profile (**Currently broken**)
- **Instagram**
- **instagram:story**
- **instagram:tag**: Instagram hashtag search URLs
- **instagram:user**: Instagram user profile (**Currently broken**)
- **InstagramIOS**: IOS instagram:// URL
- **Internazionale**
- **InternetVideoArchive**
@ -632,6 +635,7 @@ # Supported sites
- **ivi**: ivi.ru
- **ivi:compilation**: ivi.ru compilations
- **ivideon**: Ivideon TV
- **Ivoox**
- **IVXPlayer**
- **iwara**: [*iwara*](## "netrc machine")
- **iwara:playlist**: [*iwara*](## "netrc machine")
@ -644,7 +648,10 @@ # Supported sites
- **jiocinema**: [*jiocinema*](## "netrc machine")
- **jiocinema:series**: [*jiocinema*](## "netrc machine")
- **jiosaavn:album**
- **jiosaavn:artist**
- **jiosaavn:playlist**
- **jiosaavn:show**
- **jiosaavn:show:playlist**
- **jiosaavn:song**
- **Joj**
- **JoqrAg**: 超!A&G+ 文化放送 (f.k.a. AGQR) Nippon Cultural Broadcasting, Inc. (JOQR)
@ -661,7 +668,6 @@ # Supported sites
- **KelbyOne**: (**Currently broken**)
- **Kenh14Playlist**
- **Kenh14Video**
- **Ketnet**
- **khanacademy**
- **khanacademy:unit**
- **kick:clips**
@ -670,6 +676,7 @@ # Supported sites
- **Kicker**
- **KickStarter**
- **Kika**: KiKA.de
- **KikaPlaylist**
- **kinja:embed**
- **KinoPoisk**
- **Kommunetv**
@ -722,6 +729,7 @@ # Supported sites
- **limelight:channel**
- **limelight:channel_list**
- **LinkedIn**: [*linkedin*](## "netrc machine")
- **linkedin:events**: [*linkedin*](## "netrc machine")
- **linkedin:learning**: [*linkedin*](## "netrc machine")
- **linkedin:learning:course**: [*linkedin*](## "netrc machine")
- **Liputan6**
@ -733,9 +741,11 @@ # Supported sites
- **Livestreamfails**
- **Lnk**
- **loc**: Library of Congress
- **Loco**
- **loom**
- **loom:folder**
- **LoveHomePorn**
- **LRTRadio**
- **LRTStream**
- **LRTVOD**
- **LSMLREmbed**
@ -757,7 +767,7 @@ # Supported sites
- **ManotoTV**: Manoto TV (Episode)
- **ManotoTVLive**: Manoto TV (Live)
- **ManotoTVShow**: Manoto TV (Show)
- **ManyVids**: (**Currently broken**)
- **ManyVids**
- **MaoriTV**
- **Markiza**: (**Currently broken**)
- **MarkizaPage**: (**Currently broken**)
@ -827,11 +837,11 @@ # Supported sites
- **MotherlessUploader**
- **Motorsport**: motorsport.com (**Currently broken**)
- **MovieFap**
- **Moviepilot**
- **moviepilot**: Moviepilot trailer
- **MoviewPlay**
- **Moviezine**
- **MovingImage**
- **MSN**: (**Currently broken**)
- **MSN**
- **mtg**: MTG services
- **mtv**
- **mtv.de**: (**Currently broken**)
@ -944,7 +954,7 @@ # Supported sites
- **nickelodeonru**
- **niconico**: [*niconico*](## "netrc machine") ニコニコ動画
- **niconico:history**: NicoNico user history or likes. Requires cookies.
- **niconico:live**: ニコニコ生放送
- **niconico:live**: [*niconico*](## "netrc machine") ニコニコ生放送
- **niconico:playlist**
- **niconico:series**
- **niconico:tag**: NicoNico video tag URLs
@ -1051,6 +1061,8 @@ # Supported sites
- **Parler**: Posts on parler.com
- **parliamentlive.tv**: UK parliament videos
- **Parlview**: (**Currently broken**)
- **parti:livestream**
- **parti:video**
- **patreon**
- **patreon:campaign**
- **pbs**: Public Broadcasting Service (PBS) and member stations: PBS: Public Broadcasting Service, APT - Alabama Public Television (WBIQ), GPB/Georgia Public Broadcasting (WGTV), Mississippi Public Broadcasting (WMPN), Nashville Public Television (WNPT), WFSU-TV (WFSU), WSRE (WSRE), WTCI (WTCI), WPBA/Channel 30 (WPBA), Alaska Public Media (KAKM), Arizona PBS (KAET), KNME-TV/Channel 5 (KNME), Vegas PBS (KLVX), AETN/ARKANSAS ETV NETWORK (KETS), KET (WKLE), WKNO/Channel 10 (WKNO), LPB/LOUISIANA PUBLIC BROADCASTING (WLPB), OETA (KETA), Ozarks Public Television (KOZK), WSIU Public Broadcasting (WSIU), KEET TV (KEET), KIXE/Channel 9 (KIXE), KPBS San Diego (KPBS), KQED (KQED), KVIE Public Television (KVIE), PBS SoCal/KOCE (KOCE), ValleyPBS (KVPT), CONNECTICUT PUBLIC TELEVISION (WEDH), KNPB Channel 5 (KNPB), SOPTV (KSYS), Rocky Mountain PBS (KRMA), KENW-TV3 (KENW), KUED Channel 7 (KUED), Wyoming PBS (KCWC), Colorado Public Television / KBDI 12 (KBDI), KBYU-TV (KBYU), Thirteen/WNET New York (WNET), WGBH/Channel 2 (WGBH), WGBY (WGBY), NJTV Public Media NJ (WNJT), WLIW21 (WLIW), mpt/Maryland Public Television (WMPB), WETA Television and Radio (WETA), WHYY (WHYY), PBS 39 (WLVT), WVPT - Your Source for PBS and More! 
(WVPT), Howard University Television (WHUT), WEDU PBS (WEDU), WGCU Public Media (WGCU), WPBT2 (WPBT), WUCF TV (WUCF), WUFT/Channel 5 (WUFT), WXEL/Channel 42 (WXEL), WLRN/Channel 17 (WLRN), WUSF Public Broadcasting (WUSF), ETV (WRLK), UNC-TV (WUNC), PBS Hawaii - Oceanic Cable Channel 10 (KHET), Idaho Public Television (KAID), KSPS (KSPS), OPB (KOPB), KWSU/Channel 10 & KTNW/Channel 31 (KWSU), WILL-TV (WILL), Network Knowledge - WSEC/Springfield (WSEC), WTTW11 (WTTW), Iowa Public Television/IPTV (KDIN), Nine Network (KETC), PBS39 Fort Wayne (WFWA), WFYI Indianapolis (WFYI), Milwaukee Public Television (WMVS), WNIN (WNIN), WNIT Public Television (WNIT), WPT (WPNE), WVUT/Channel 22 (WVUT), WEIU/Channel 51 (WEIU), WQPT-TV (WQPT), WYCC PBS Chicago (WYCC), WIPB-TV (WIPB), WTIU (WTIU), CET (WCET), ThinkTVNetwork (WPTD), WBGU-TV (WBGU), WGVU TV (WGVU), NET1 (KUON), Pioneer Public Television (KWCM), SDPB Television (KUSD), TPT (KTCA), KSMQ (KSMQ), KPTS/Channel 8 (KPTS), KTWU/Channel 11 (KTWU), East Tennessee PBS (WSJK), WCTE-TV (WCTE), WLJT, Channel 11 (WLJT), WOSU TV (WOSU), WOUB/WOUC (WOUB), WVPB (WVPB), WKYU-PBS (WKYU), KERA 13 (KERA), MPBN (WCBB), Mountain Lake PBS (WCFE), NHPTV (WENH), Vermont PBS (WETK), witf (WITF), WQED Multimedia (WQED), WMHT Educational Telecommunications (WMHT), Q-TV (WDCQ), WTVS Detroit Public TV (WTVS), CMU Public Television (WCMU), WKAR-TV (WKAR), WNMU-TV Public TV 13 (WNMU), WDSE - WRPT (WDSE), WGTE TV (WGTE), Lakeland Public Television (KAWE), KMOS-TV - Channels 6.1, 6.2 and 6.3 (KMOS), MontanaPBS (KUSM), KRWG/Channel 22 (KRWG), KACV (KACV), KCOS/Channel 13 (KCOS), WCNY/Channel 24 (WCNY), WNED (WNED), WPBS (WPBS), WSKG Public TV (WSKG), WXXI (WXXI), WPSU (WPSU), WVIA Public Media Studios (WVIA), WTVI (WTVI), Western Reserve PBS (WNEO), WVIZ/PBS ideastream (WVIZ), KCTS 9 (KCTS), Basin PBS (KPBT), KUHT / Channel 8 (KUHT), KLRN (KLRN), KLRU (KLRU), WTJX Channel 12 (WTJX), WCVE PBS (WCVE), KBTC Public Television (KBTC)
@ -1071,8 +1083,8 @@ # Supported sites
- **Photobucket**
- **PiaLive**
- **Piapro**: [*piapro*](## "netrc machine")
- **Picarto**
- **PicartoVod**
- **picarto**
- **picarto:vod**
- **Piksel**
- **Pinkbike**
- **Pinterest**
@ -1225,6 +1237,7 @@ # Supported sites
- **RoosterTeeth**: [*roosterteeth*](## "netrc machine")
- **RoosterTeethSeries**: [*roosterteeth*](## "netrc machine")
- **RottenTomatoes**
- **RoyaLive**
- **Rozhlas**
- **RozhlasVltava**
- **RTBF**: [*rtbf*](## "netrc machine") (**Currently broken**)
@ -1245,12 +1258,10 @@ # Supported sites
- **RTVCKaltura**
- **RTVCPlay**
- **RTVCPlayEmbed**
- **rtve.es:alacarta**: RTVE a la carta
- **rtve.es:alacarta**: RTVE a la carta and Play
- **rtve.es:audio**: RTVE audio
- **rtve.es:infantil**: RTVE infantil
- **rtve.es:live**: RTVE.es live streams
- **rtve.es:television**
- **RTVS**
- **rtvslo.si**
- **rtvslo.si:show**
- **RudoVideo**
@ -1305,8 +1316,8 @@ # Supported sites
- **sejm**
- **Sen**
- **SenalColombiaLive**: (**Currently broken**)
- **SenateGov**
- **SenateISVP**
- **senate.gov**
- **senate.gov:isvp**
- **SendtoNews**: (**Currently broken**)
- **Servus**
- **Sexu**: (**Currently broken**)
@ -1342,6 +1353,7 @@ # Supported sites
- **Smotrim**
- **SnapchatSpotlight**
- **Snotr**
- **SoftWhiteUnderbelly**: [*softwhiteunderbelly*](## "netrc machine")
- **Sohu**
- **SohuV**
- **SonyLIV**: [*sonyliv*](## "netrc machine")
@ -1380,7 +1392,6 @@ # Supported sites
- **Spreaker**
- **SpreakerShow**
- **SpringboardPlatform**
- **Sprout**
- **SproutVideo**
- **sr:mediathek**: Saarländischer Rundfunk (**Currently broken**)
- **SRGSSR**
@ -1398,12 +1409,14 @@ # Supported sites
- **StoryFire**
- **StoryFireSeries**
- **StoryFireUser**
- **Streaks**
- **Streamable**
- **StreamCZ**
- **StreetVoice**
- **StretchInternet**
- **Stripchat**
- **stv:player**
- **stvr**: Slovak Television and Radio (formerly RTVS)
- **Subsplash**
- **subsplash:playlist**
- **Substack**
@ -1536,6 +1549,8 @@ # Supported sites
- **tv5unis**
- **tv5unis:video**
- **tv8.it**
- **tv8.it:live**: TV8 Live
- **tv8.it:playlist**: TV8 Playlist
- **TVANouvelles**
- **TVANouvellesArticle**
- **tvaplus**: TVA+
@ -1556,6 +1571,8 @@ # Supported sites
- **tvp:vod:series**
- **TVPlayer**
- **TVPlayHome**
- **tvw**
- **tvw:tvchannels**
- **Tweakers**
- **TwitCasting**
- **TwitCastingLive**
@ -1637,11 +1654,10 @@ # Supported sites
- **viewlift**
- **viewlift:embed**
- **Viidea**
- **viki**: [*viki*](## "netrc machine")
- **viki:channel**: [*viki*](## "netrc machine")
- **vimeo**: [*vimeo*](## "netrc machine")
- **vimeo:album**: [*vimeo*](## "netrc machine")
- **vimeo:channel**: [*vimeo*](## "netrc machine")
- **vimeo:event**: [*vimeo*](## "netrc machine")
- **vimeo:group**: [*vimeo*](## "netrc machine")
- **vimeo:likes**: [*vimeo*](## "netrc machine") Vimeo user likes
- **vimeo:ondemand**: [*vimeo*](## "netrc machine")
@ -1676,8 +1692,12 @@ # Supported sites
- **vpro**: npo.nl, ntr.nl, omroepwnl.nl, zapp.nl and npo3.nl
- **vqq:series**
- **vqq:video**
- **vrsquare**: VR SQUARE
- **vrsquare:channel**
- **vrsquare:search**
- **vrsquare:section**
- **VRT**: VRT NWS, Flanders News, Flandern Info and Sporza
- **VrtNU**: [*vrtnu*](## "netrc machine") VRT MAX
- **vrtmax**: [*vrtnu*](## "netrc machine") VRT MAX (formerly VRT NU)
- **VTM**: (**Currently broken**)
- **VTV**
- **VTVGo**
@ -1812,14 +1832,12 @@ # Supported sites
- **ZattooLive**: [*zattoo*](## "netrc machine")
- **ZattooMovies**: [*zattoo*](## "netrc machine")
- **ZattooRecordings**: [*zattoo*](## "netrc machine")
- **ZDF**
- **ZDFChannel**
- **zdf**
- **zdf:channel**
- **Zee5**: [*zee5*](## "netrc machine")
- **zee5:series**
- **ZeeNews**: (**Currently broken**)
- **ZenPorn**
- **ZenYandex**
- **ZenYandexChannel**
- **ZetlandDKArticle**
- **Zhihu**
- **zingmp3**: zingmp3.vn


@ -136,7 +136,7 @@ def _iter_differences(got, expected, field):
return
if op == 'startswith':
if not val.startswith(got):
if not got.startswith(val):
yield field, f'should start with {val!r}, got {got!r}'
return
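The operand order matters in this check: the value under test (`got`) must begin with the expected prefix (`val`), not the other way around. A minimal standalone sketch follows; the helper name `startswith_mismatch` is hypothetical and not part of the test helpers.

```python
def startswith_mismatch(got, val):
    """Hypothetical helper: return an error message when `got` lacks the prefix `val`, else None."""
    if not got.startswith(val):
        return f'should start with {val!r}, got {got!r}'
    return None


# The extracted value begins with the expected prefix, so nothing is reported
assert startswith_mismatch('https://example.com/video.mp4', 'https://example.com/') is None

# The reversed check, val.startswith(got), would wrongly reject the same pair
assert not 'https://example.com/'.startswith('https://example.com/video.mp4')
```

With the reversed operands, any extracted value longer than its expected prefix would be flagged as a difference.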


@ -638,6 +638,7 @@ def test_parse_m3u8_formats(self):
'img_bipbop_adv_example_fmp4',
'https://devstreaming-cdn.apple.com/videos/streaming/examples/img_bipbop_adv_example_fmp4/master.m3u8',
[{
# 60kbps (bitrate not provided in m3u8); sorted as worst because it's grouped with lowest bitrate video track
'format_id': 'aud1-English',
'url': 'https://devstreaming-cdn.apple.com/videos/streaming/examples/img_bipbop_adv_example_fmp4/a1/prog_index.m3u8',
'manifest_url': 'https://devstreaming-cdn.apple.com/videos/streaming/examples/img_bipbop_adv_example_fmp4/master.m3u8',
@ -645,15 +646,9 @@ def test_parse_m3u8_formats(self):
'ext': 'mp4',
'protocol': 'm3u8_native',
'audio_ext': 'mp4',
'source_preference': 0,
}, {
'format_id': 'aud2-English',
'url': 'https://devstreaming-cdn.apple.com/videos/streaming/examples/img_bipbop_adv_example_fmp4/a2/prog_index.m3u8',
'manifest_url': 'https://devstreaming-cdn.apple.com/videos/streaming/examples/img_bipbop_adv_example_fmp4/master.m3u8',
'language': 'en',
'ext': 'mp4',
'protocol': 'm3u8_native',
'audio_ext': 'mp4',
}, {
# 192kbps (bitrate not provided in m3u8)
'format_id': 'aud3-English',
'url': 'https://devstreaming-cdn.apple.com/videos/streaming/examples/img_bipbop_adv_example_fmp4/a3/prog_index.m3u8',
'manifest_url': 'https://devstreaming-cdn.apple.com/videos/streaming/examples/img_bipbop_adv_example_fmp4/master.m3u8',
@ -661,6 +656,17 @@ def test_parse_m3u8_formats(self):
'ext': 'mp4',
'protocol': 'm3u8_native',
'audio_ext': 'mp4',
'source_preference': 1,
}, {
# 384kbps (bitrate not provided in m3u8); sorted as best because it's grouped with the highest bitrate video track
'format_id': 'aud2-English',
'url': 'https://devstreaming-cdn.apple.com/videos/streaming/examples/img_bipbop_adv_example_fmp4/a2/prog_index.m3u8',
'manifest_url': 'https://devstreaming-cdn.apple.com/videos/streaming/examples/img_bipbop_adv_example_fmp4/master.m3u8',
'language': 'en',
'ext': 'mp4',
'protocol': 'm3u8_native',
'audio_ext': 'mp4',
'source_preference': 2,
}, {
'format_id': '530',
'url': 'https://devstreaming-cdn.apple.com/videos/streaming/examples/img_bipbop_adv_example_fmp4/v2/prog_index.m3u8',


@ -1435,6 +1435,27 @@ def test_load_plugins_compat(self):
        FakeYDL().close()
        assert all_plugins_loaded.value

    def test_close_hooks(self):
        # Should call all registered close hooks on close
        close_hook_called = False
        close_hook_two_called = False

        def close_hook():
            nonlocal close_hook_called
            close_hook_called = True

        def close_hook_two():
            nonlocal close_hook_two_called
            close_hook_two_called = True

        ydl = FakeYDL()
        ydl.add_close_hook(close_hook)
        ydl.add_close_hook(close_hook_two)

        ydl.close()
        self.assertTrue(close_hook_called, 'Close hook was not called')
        self.assertTrue(close_hook_two_called, 'Close hook two was not called')


if __name__ == '__main__':
    unittest.main()
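The behaviour exercised by `test_close_hooks` (every registered hook runs when the object is closed) can be sketched in isolation. This is a minimal illustration under stated assumptions: the class `HookedResource` is hypothetical and is not the `YoutubeDL` implementation; only the method names `add_close_hook` and `close` are taken from the test above.

```python
class HookedResource:
    """Hypothetical stand-in for an object that supports close hooks."""

    def __init__(self):
        self._close_hooks = []

    def add_close_hook(self, hook):
        # Hooks are stored in registration order
        self._close_hooks.append(hook)

    def close(self):
        # Call every registered hook once per close() call
        for hook in self._close_hooks:
            hook()


calls = []
resource = HookedResource()
resource.add_close_hook(lambda: calls.append('first'))
resource.add_close_hook(lambda: calls.append('second'))
resource.close()
assert calls == ['first', 'second']
```

Whether the real implementation guarantees registration order is not stated in the diff; the test only checks that both hooks were called.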


@ -58,6 +58,14 @@ def test_get_desktop_environment(self):
({'DESKTOP_SESSION': 'kde'}, _LinuxDesktopEnvironment.KDE3),
({'DESKTOP_SESSION': 'xfce'}, _LinuxDesktopEnvironment.XFCE),
({'XDG_CURRENT_DESKTOP': 'my_custom_de', 'DESKTOP_SESSION': 'gnome'}, _LinuxDesktopEnvironment.GNOME),
({'XDG_CURRENT_DESKTOP': 'my_custom_de', 'DESKTOP_SESSION': 'mate'}, _LinuxDesktopEnvironment.GNOME),
({'XDG_CURRENT_DESKTOP': 'my_custom_de', 'DESKTOP_SESSION': 'kde4'}, _LinuxDesktopEnvironment.KDE4),
({'XDG_CURRENT_DESKTOP': 'my_custom_de', 'DESKTOP_SESSION': 'kde'}, _LinuxDesktopEnvironment.KDE3),
({'XDG_CURRENT_DESKTOP': 'my_custom_de', 'DESKTOP_SESSION': 'xfce'}, _LinuxDesktopEnvironment.XFCE),
({'XDG_CURRENT_DESKTOP': 'my_custom_de', 'DESKTOP_SESSION': 'my_custom_de', 'GNOME_DESKTOP_SESSION_ID': 1}, _LinuxDesktopEnvironment.GNOME),
({'GNOME_DESKTOP_SESSION_ID': 1}, _LinuxDesktopEnvironment.GNOME),
({'KDE_FULL_SESSION': 1}, _LinuxDesktopEnvironment.KDE3),
({'KDE_FULL_SESSION': 1, 'DESKTOP_SESSION': 'kde4'}, _LinuxDesktopEnvironment.KDE4),


@ -331,10 +331,6 @@ def test_http_connect_auth(self, handler, ctx):
assert proxy_info['proxy'] == server_address
assert 'Proxy-Authorization' in proxy_info['headers']
@pytest.mark.skip_handler(
'Requests',
'bug in urllib3 causes unclosed socket: https://github.com/urllib3/urllib3/issues/3374',
)
def test_http_connect_bad_auth(self, handler, ctx):
with ctx.http_server(HTTPConnectProxyHandler, username='test', password='test') as server_address:
with handler(verify=False, proxies={ctx.REQUEST_PROTO: f'http://test:bad@{server_address}'}) as rh:


@ -118,6 +118,7 @@ def test_assignments(self):
self._test('function f(){var x = 20; x = 30 + 1; return x;}', 31)
self._test('function f(){var x = 20; x += 30 + 1; return x;}', 51)
self._test('function f(){var x = 20; x -= 30 + 1; return x;}', -11)
self._test('function f(){var x = 2; var y = ["a", "b"]; y[x%y["length"]]="z"; return y}', ['z', 'b'])
@unittest.skip('Not implemented')
def test_comments(self):
@ -384,7 +385,7 @@ def test_negative(self):
@unittest.skip('Not implemented')
def test_packed(self):
jsi = JSInterpreter('''function f(p,a,c,k,e,d){while(c--)if(k[c])p=p.replace(new RegExp('\\b'+c.toString(a)+'\\b','g'),k[c]);return p}''')
self.assertEqual(jsi.call_function('f', '''h 7=g("1j");7.7h({7g:[{33:"w://7f-7e-7d-7c.v.7b/7a/79/78/77/76.74?t=73&s=2s&e=72&f=2t&71=70.0.0.1&6z=6y&6x=6w"}],6v:"w://32.v.u/6u.31",16:"r%",15:"r%",6t:"6s",6r:"",6q:"l",6p:"l",6o:"6n",6m:\'6l\',6k:"6j",9:[{33:"/2u?b=6i&n=50&6h=w://32.v.u/6g.31",6f:"6e"}],1y:{6d:1,6c:\'#6b\',6a:\'#69\',68:"67",66:30,65:r,},"64":{63:"%62 2m%m%61%5z%5y%5x.u%5w%5v%5u.2y%22 2k%m%1o%22 5t%m%1o%22 5s%m%1o%22 2j%m%5r%22 16%m%5q%22 15%m%5p%22 5o%2z%5n%5m%2z",5l:"w://v.u/d/1k/5k.2y",5j:[]},\'5i\':{"5h":"5g"},5f:"5e",5d:"w://v.u",5c:{},5b:l,1x:[0.25,0.50,0.75,1,1.25,1.5,2]});h 1m,1n,5a;h 59=0,58=0;h 7=g("1j");h 2x=0,57=0,56=0;$.55({54:{\'53-52\':\'2i-51\'}});7.j(\'4z\',6(x){c(5>0&&x.1l>=5&&1n!=1){1n=1;$(\'q.4y\').4x(\'4w\')}});7.j(\'13\',6(x){2x=x.1l});7.j(\'2g\',6(x){2w(x)});7.j(\'4v\',6(){$(\'q.2v\').4u()});6 2w(x){$(\'q.2v\').4t();c(1m)19;1m=1;17=0;c(4s.4r===l){17=1}$.4q(\'/2u?b=4p&2l=1k&4o=2t-4n-4m-2s-4l&4k=&4j=&4i=&17=\'+17,6(2r){$(\'#4h\').4g(2r)});$(\'.3-8-4f-4e:4d("4c")\').2h(6(e){2q();g().4b(0);g().4a(l)});6 2q(){h $14=$("<q />").2p({1l:"49",16:"r%",15:"r%",48:0,2n:0,2o:47,46:"45(10%, 10%, 10%, 0.4)","44-43":"42"});$("<41 />").2p({16:"60%",15:"60%",2o:40,"3z-2n":"3y"}).3x({\'2m\':\'/?b=3w&2l=1k\',\'2k\':\'0\',\'2j\':\'2i\'}).2f($14);$14.2h(6(){$(3v).3u();g().2g()});$14.2f($(\'#1j\'))}g().13(0);}6 3t(){h 9=7.1b(2e);2d.2c(9);c(9.n>1){1r(i=0;i<9.n;i++){c(9[i].1a==2e){2d.2c(\'!!=\'+i);7.1p(i)}}}}7.j(\'3s\',6(){g().1h("/2a/3r.29","3q 10 28",6(){g().13(g().27()+10)},"2b");$("q[26=2b]").23().21(\'.3-20-1z\');g().1h("/2a/3p.29","3o 10 28",6(){h 12=g().27()-10;c(12<0)12=0;g().13(12)},"24");$("q[26=24]").23().21(\'.3-20-1z\');});6 1i(){}7.j(\'3n\',6(){1i()});7.j(\'3m\',6(){1i()});7.j("k",6(y){h 9=7.1b();c(9.n<2)19;$(\'.3-8-3l-3k\').3j(6(){$(\'#3-8-a-k\').1e(\'3-8-a-z\');$(\'.3-a-k\').p(\'o-1f\',\'11\')});7.1h("/3i/3h.3g","3f 3e",6(){$(\'.3-1w\').3d(\'3-8-1v\');$(\'.3-8-1y, 
.3-8-1x\').p(\'o-1g\',\'11\');c($(\'.3-1w\').3c(\'3-8-1v\')){$(\'.3-a-k\').p(\'o-1g\',\'l\');$(\'.3-a-k\').p(\'o-1f\',\'l\');$(\'.3-8-a\').1e(\'3-8-a-z\');$(\'.3-8-a:1u\').3b(\'3-8-a-z\')}3a{$(\'.3-a-k\').p(\'o-1g\',\'11\');$(\'.3-a-k\').p(\'o-1f\',\'11\');$(\'.3-8-a:1u\').1e(\'3-8-a-z\')}},"39");7.j("38",6(y){1d.37(\'1c\',y.9[y.36].1a)});c(1d.1t(\'1c\')){35("1s(1d.1t(\'1c\'));",34)}});h 18;6 1s(1q){h 9=7.1b();c(9.n>1){1r(i=0;i<9.n;i++){c(9[i].1a==1q){c(i==18){19}18=i;7.1p(i)}}}}',36,270,'|||jw|||function|player|settings|tracks|submenu||if||||jwplayer|var||on|audioTracks|true|3D|length|aria|attr|div|100|||sx|filemoon|https||event|active||false|tt|seek|dd|height|width|adb|current_audio|return|name|getAudioTracks|default_audio|localStorage|removeClass|expanded|checked|addButton|callMeMaybe|vplayer|0fxcyc2ajhp1|position|vvplay|vvad|220|setCurrentAudioTrack|audio_name|for|audio_set|getItem|last|open|controls|playbackRates|captions|rewind|icon|insertAfter||detach|ff00||button|getPosition|sec|png|player8|ff11|log|console|track_name|appendTo|play|click|no|scrolling|frameborder|file_code|src|top|zIndex|css|showCCform|data|1662367683|383371|dl|video_ad|doPlay|prevt|mp4|3E||jpg|thumbs|file|300|setTimeout|currentTrack|setItem|audioTrackChanged|dualSound|else|addClass|hasClass|toggleClass|Track|Audio|svg|dualy|images|mousedown|buttons|topbar|playAttemptFailed|beforePlay|Rewind|fr|Forward|ff|ready|set_audio_track|remove|this|upload_srt|prop|50px|margin|1000001|iframe|center|align|text|rgba|background|1000000|left|absolute|pause|setCurrentCaptions|Upload|contains|item|content|html|fviews|referer|prem|embed|3e57249ef633e0d03bf76ceb8d8a4b65|216|83|hash|view|get|TokenZir|window|hide|show|complete|slow|fadeIn|video_ad_fadein|time||cache|Cache|Content|headers|ajaxSetup|v2done|tott|vastdone2|vastdone1|vvbefore|playbackRateControls|cast|aboutlink|FileMoon|abouttext|UHD|1870|qualityLabels|sites|GNOME_POWER|link|2Fiframe|3C|allowfullscreen|22360|22640|22no|marginheight|marginwidth|2FGNOME
_POWER|2F0fxcyc2ajhp1|2Fe|2Ffilemoon|2F|3A||22https|3Ciframe|code|sharing|fontOpacity|backgroundOpacity|Tahoma|fontFamily|303030|backgroundColor|FFFFFF|color|userFontScale|thumbnails|kind|0fxcyc2ajhp10000|url|get_slides|start|startparam|none|preload|html5|primary|hlshtml|androidhls|duration|uniform|stretching|0fxcyc2ajhp1_xt|image|2048|sp|6871|asn|127|srv|43200|_g3XlBcu2lmD9oDexD2NLWSmah2Nu3XcDrl93m9PwXY|m3u8||master|0fxcyc2ajhp1_x|00076|01|hls2|to|s01|delivery|storage|moon|sources|setup'''.split('|')))
self.assertEqual(jsi.call_function('f', '''h 7=g("1j");7.7h({7g:[{33:"w://7f-7e-7d-7c.v.7b/7a/79/78/77/76.74?t=73&s=2s&e=72&f=2t&71=70.0.0.1&6z=6y&6x=6w"}],6v:"w://32.v.u/6u.31",16:"r%",15:"r%",6t:"6s",6r:"",6q:"l",6p:"l",6o:"6n",6m:\'6l\',6k:"6j",9:[{33:"/2u?b=6i&n=50&6h=w://32.v.u/6g.31",6f:"6e"}],1y:{6d:1,6c:\'#6b\',6a:\'#69\',68:"67",66:30,65:r,},"64":{63:"%62 2m%m%61%5z%5y%5x.u%5w%5v%5u.2y%22 2k%m%1o%22 5t%m%1o%22 5s%m%1o%22 2j%m%5r%22 16%m%5q%22 15%m%5p%22 5o%2z%5n%5m%2z",5l:"w://v.u/d/1k/5k.2y",5j:[]},\'5i\':{"5h":"5g"},5f:"5e",5d:"w://v.u",5c:{},5b:l,1x:[0.25,0.50,0.75,1,1.25,1.5,2]});h 1m,1n,5a;h 59=0,58=0;h 7=g("1j");h 2x=0,57=0,56=0;$.55({54:{\'53-52\':\'2i-51\'}});7.j(\'4z\',6(x){c(5>0&&x.1l>=5&&1n!=1){1n=1;$(\'q.4y\').4x(\'4w\')}});7.j(\'13\',6(x){2x=x.1l});7.j(\'2g\',6(x){2w(x)});7.j(\'4v\',6(){$(\'q.2v\').4u()});6 2w(x){$(\'q.2v\').4t();c(1m)19;1m=1;17=0;c(4s.4r===l){17=1}$.4q(\'/2u?b=4p&2l=1k&4o=2t-4n-4m-2s-4l&4k=&4j=&4i=&17=\'+17,6(2r){$(\'#4h\').4g(2r)});$(\'.3-8-4f-4e:4d("4c")\').2h(6(e){2q();g().4b(0);g().4a(l)});6 2q(){h $14=$("<q />").2p({1l:"49",16:"r%",15:"r%",48:0,2n:0,2o:47,46:"45(10%, 10%, 10%, 0.4)","44-43":"42"});$("<41 />").2p({16:"60%",15:"60%",2o:40,"3z-2n":"3y"}).3x({\'2m\':\'/?b=3w&2l=1k\',\'2k\':\'0\',\'2j\':\'2i\'}).2f($14);$14.2h(6(){$(3v).3u();g().2g()});$14.2f($(\'#1j\'))}g().13(0);}6 3t(){h 9=7.1b(2e);2d.2c(9);c(9.n>1){1r(i=0;i<9.n;i++){c(9[i].1a==2e){2d.2c(\'!!=\'+i);7.1p(i)}}}}7.j(\'3s\',6(){g().1h("/2a/3r.29","3q 10 28",6(){g().13(g().27()+10)},"2b");$("q[26=2b]").23().21(\'.3-20-1z\');g().1h("/2a/3p.29","3o 10 28",6(){h 12=g().27()-10;c(12<0)12=0;g().13(12)},"24");$("q[26=24]").23().21(\'.3-20-1z\');});6 1i(){}7.j(\'3n\',6(){1i()});7.j(\'3m\',6(){1i()});7.j("k",6(y){h 9=7.1b();c(9.n<2)19;$(\'.3-8-3l-3k\').3j(6(){$(\'#3-8-a-k\').1e(\'3-8-a-z\');$(\'.3-a-k\').p(\'o-1f\',\'11\')});7.1h("/3i/3h.3g","3f 3e",6(){$(\'.3-1w\').3d(\'3-8-1v\');$(\'.3-8-1y, 
.3-8-1x\').p(\'o-1g\',\'11\');c($(\'.3-1w\').3c(\'3-8-1v\')){$(\'.3-a-k\').p(\'o-1g\',\'l\');$(\'.3-a-k\').p(\'o-1f\',\'l\');$(\'.3-8-a\').1e(\'3-8-a-z\');$(\'.3-8-a:1u\').3b(\'3-8-a-z\')}3a{$(\'.3-a-k\').p(\'o-1g\',\'11\');$(\'.3-a-k\').p(\'o-1f\',\'11\');$(\'.3-8-a:1u\').1e(\'3-8-a-z\')}},"39");7.j("38",6(y){1d.37(\'1c\',y.9[y.36].1a)});c(1d.1t(\'1c\')){35("1s(1d.1t(\'1c\'));",34)}});h 18;6 1s(1q){h 9=7.1b();c(9.n>1){1r(i=0;i<9.n;i++){c(9[i].1a==1q){c(i==18){19}18=i;7.1p(i)}}}}',36,270,'|||jw|||function|player|settings|tracks|submenu||if||||jwplayer|var||on|audioTracks|true|3D|length|aria|attr|div|100|||sx|filemoon|https||event|active||false|tt|seek|dd|height|width|adb|current_audio|return|name|getAudioTracks|default_audio|localStorage|removeClass|expanded|checked|addButton|callMeMaybe|vplayer|0fxcyc2ajhp1|position|vvplay|vvad|220|setCurrentAudioTrack|audio_name|for|audio_set|getItem|last|open|controls|playbackRates|captions|rewind|icon|insertAfter||detach|ff00||button|getPosition|sec|png|player8|ff11|log|console|track_name|appendTo|play|click|no|scrolling|frameborder|file_code|src|top|zIndex|css|showCCform|data|1662367683|383371|dl|video_ad|doPlay|prevt|mp4|3E||jpg|thumbs|file|300|setTimeout|currentTrack|setItem|audioTrackChanged|dualSound|else|addClass|hasClass|toggleClass|Track|Audio|svg|dualy|images|mousedown|buttons|topbar|playAttemptFailed|beforePlay|Rewind|fr|Forward|ff|ready|set_audio_track|remove|this|upload_srt|prop|50px|margin|1000001|iframe|center|align|text|rgba|background|1000000|left|absolute|pause|setCurrentCaptions|Upload|contains|item|content|html|fviews|referer|prem|embed|3e57249ef633e0d03bf76ceb8d8a4b65|216|83|hash|view|get|TokenZir|window|hide|show|complete|slow|fadeIn|video_ad_fadein|time||cache|Cache|Content|headers|ajaxSetup|v2done|tott|vastdone2|vastdone1|vvbefore|playbackRateControls|cast|aboutlink|FileMoon|abouttext|UHD|1870|qualityLabels|sites|GNOME_POWER|link|2Fiframe|3C|allowfullscreen|22360|22640|22no|marginheight|marginwidth|2FGNOME
_POWER|2F0fxcyc2ajhp1|2Fe|2Ffilemoon|2F|3A||22https|3Ciframe|code|sharing|fontOpacity|backgroundOpacity|Tahoma|fontFamily|303030|backgroundColor|FFFFFF|color|userFontScale|thumbnails|kind|0fxcyc2ajhp10000|url|get_slides|start|startparam|none|preload|html5|primary|hlshtml|androidhls|duration|uniform|stretching|0fxcyc2ajhp1_xt|image|2048|sp|6871|asn|127|srv|43200|_g3XlBcu2lmD9oDexD2NLWSmah2Nu3XcDrl93m9PwXY|m3u8||master|0fxcyc2ajhp1_x|00076|01|hls2|to|s01|delivery|storage|moon|sources|setup'''.split('|'))) # noqa: SIM905
def test_join(self):
test_input = list('test')
@ -403,6 +404,8 @@ def test_split(self):
test_result = list('test')
tests = [
'function f(a, b){return a.split(b)}',
'function f(a, b){return a["split"](b)}',
'function f(a, b){let x = ["split"]; return a[x[0]](b)}',
'function f(a, b){return String.prototype.split.call(a, b)}',
'function f(a, b){return String.prototype.split.apply(a, [b])}',
]
@ -441,6 +444,9 @@ def test_slice(self):
self._test('function f(){return "012345678".slice(-1, 1)}', '')
self._test('function f(){return "012345678".slice(-3, -1)}', '67')
def test_splice(self):
self._test('function f(){var T = ["0", "1", "2"]; T["splice"](2, 1, "0")[0]; return T }', ['0', '1', '0'])
def test_js_number_to_string(self):
for test, radix, expected in [
(0, None, '0'),
@ -462,6 +468,24 @@ def test_js_number_to_string(self):
        ]:
            assert js_number_to_string(test, radix) == expected

    def test_extract_function(self):
        jsi = JSInterpreter('function a(b) { return b + 1; }')
        func = jsi.extract_function('a')
        self.assertEqual(func([2]), 3)

    def test_extract_function_with_global_stack(self):
        jsi = JSInterpreter('function c(d) { return d + e + f + g; }')
        func = jsi.extract_function('c', {'e': 10}, {'f': 100, 'g': 1000})
        self.assertEqual(func([1]), 1111)

    def test_increment_decrement(self):
        self._test('function f() { var x = 1; return ++x; }', 2)
        self._test('function f() { var x = 1; return x++; }', 1)
        self._test('function f() { var x = 1; x--; return x }', 0)
        self._test('function f() { var y; var x = 1; x++, --x, x--, x--, y="z", "abc", x++; return --x }', -1)
        self._test('function f() { var a = "test--"; return a; }', 'test--')
        self._test('function f() { var b = 1; var a = "b--"; return a; }', 'b--')


if __name__ == '__main__':
    unittest.main()


@ -39,6 +39,7 @@
from yt_dlp.dependencies import brotli, curl_cffi, requests, urllib3
from yt_dlp.networking import (
HEADRequest,
PATCHRequest,
PUTRequest,
Request,
RequestDirector,
@ -614,7 +615,6 @@ def test_source_address(self, handler):
rh, Request(f'http://127.0.0.1:{self.http_port}/source_address')).read().decode()
assert source_address == data
# Not supported by CurlCFFI
@pytest.mark.skip_handler('CurlCFFI', 'not supported by curl-cffi')
def test_gzip_trailing_garbage(self, handler):
with handler() as rh:
@ -1857,6 +1857,7 @@ def test_method(self):
def test_request_helpers(self):
assert HEADRequest('http://example.com').method == 'HEAD'
assert PATCHRequest('http://example.com').method == 'PATCH'
assert PUTRequest('http://example.com').method == 'PUT'
def test_headers(self):


@ -20,7 +20,6 @@
add_accept_encoding_header,
get_redirect_method,
make_socks_proxy_opts,
select_proxy,
ssl_load_certs,
)
from yt_dlp.networking.exceptions import (
@ -28,7 +27,7 @@
IncompleteRead,
)
from yt_dlp.socks import ProxyType
from yt_dlp.utils.networking import HTTPHeaderDict
from yt_dlp.utils.networking import HTTPHeaderDict, select_proxy
TEST_DIR = os.path.dirname(os.path.abspath(__file__))

71
test/test_pot/conftest.py Normal file

@ -0,0 +1,71 @@
import collections

import pytest

from yt_dlp import YoutubeDL
from yt_dlp.cookies import YoutubeDLCookieJar
from yt_dlp.extractor.common import InfoExtractor
from yt_dlp.extractor.youtube.pot._provider import IEContentProviderLogger
from yt_dlp.extractor.youtube.pot.provider import PoTokenRequest, PoTokenContext
from yt_dlp.utils.networking import HTTPHeaderDict


class MockLogger(IEContentProviderLogger):

    log_level = IEContentProviderLogger.LogLevel.TRACE

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.messages = collections.defaultdict(list)

    def trace(self, message: str):
        self.messages['trace'].append(message)

    def debug(self, message: str):
        self.messages['debug'].append(message)

    def info(self, message: str):
        self.messages['info'].append(message)

    def warning(self, message: str, *, once=False):
        self.messages['warning'].append(message)

    def error(self, message: str):
        self.messages['error'].append(message)


@pytest.fixture
def ie() -> InfoExtractor:
    ydl = YoutubeDL()
    return ydl.get_info_extractor('Youtube')


@pytest.fixture
def logger() -> MockLogger:
    return MockLogger()


@pytest.fixture()
def pot_request() -> PoTokenRequest:
    return PoTokenRequest(
        context=PoTokenContext.GVS,
        innertube_context={'client': {'clientName': 'WEB'}},
        innertube_host='youtube.com',
        session_index=None,
        player_url=None,
        is_authenticated=False,
        video_webpage=None,

        visitor_data='example-visitor-data',
        data_sync_id='example-data-sync-id',
        video_id='example-video-id',

        request_cookiejar=YoutubeDLCookieJar(),
        request_proxy=None,
        request_headers=HTTPHeaderDict(),
        request_timeout=None,
        request_source_address=None,
        request_verify_tls=True,

        bypass_cache=False,
    )


@ -0,0 +1,117 @@
import threading
import time
from collections import OrderedDict
import pytest
from yt_dlp.extractor.youtube.pot._provider import IEContentProvider, BuiltinIEContentProvider
from yt_dlp.utils import bug_reports_message
from yt_dlp.extractor.youtube.pot._builtin.memory_cache import MemoryLRUPCP, memorylru_preference, initialize_global_cache
from yt_dlp.version import __version__
from yt_dlp.extractor.youtube.pot._registry import _pot_cache_providers, _pot_memory_cache


class TestMemoryLRUPCS:

    def test_base_type(self):
        assert issubclass(MemoryLRUPCP, IEContentProvider)
        assert issubclass(MemoryLRUPCP, BuiltinIEContentProvider)

    @pytest.fixture
    def pcp(self, ie, logger) -> MemoryLRUPCP:
        return MemoryLRUPCP(ie, logger, {}, initialize_cache=lambda max_size: (OrderedDict(), threading.Lock(), max_size))

    def test_is_registered(self):
        assert _pot_cache_providers.value.get('MemoryLRU') == MemoryLRUPCP

    def test_initialization(self, pcp):
        assert pcp.PROVIDER_NAME == 'memory'
        assert pcp.PROVIDER_VERSION == __version__
        assert pcp.BUG_REPORT_MESSAGE == bug_reports_message(before='')
        assert pcp.is_available()

    def test_store_and_get(self, pcp):
        pcp.store('key1', 'value1', int(time.time()) + 60)
        assert pcp.get('key1') == 'value1'
        assert len(pcp.cache) == 1

    def test_store_ignore_expired(self, pcp):
        pcp.store('key1', 'value1', int(time.time()) - 1)
        assert len(pcp.cache) == 0
        assert pcp.get('key1') is None
        assert len(pcp.cache) == 0

    def test_store_override_existing_key(self, ie, logger):
        MAX_SIZE = 2
        pcp = MemoryLRUPCP(ie, logger, {}, initialize_cache=lambda max_size: (OrderedDict(), threading.Lock(), MAX_SIZE))
        pcp.store('key1', 'value1', int(time.time()) + 60)
        pcp.store('key2', 'value2', int(time.time()) + 60)
        assert len(pcp.cache) == 2
        pcp.store('key1', 'value2', int(time.time()) + 60)
        # Ensure that the override key gets added to the end of the cache instead of in the same position
        pcp.store('key3', 'value3', int(time.time()) + 60)
        assert pcp.get('key1') == 'value2'

    def test_store_ignore_expired_existing_key(self, pcp):
        pcp.store('key1', 'value2', int(time.time()) + 60)
        pcp.store('key1', 'value1', int(time.time()) - 1)
        assert len(pcp.cache) == 1
        assert pcp.get('key1') == 'value2'
        assert len(pcp.cache) == 1

    def test_get_key_expired(self, pcp):
        pcp.store('key1', 'value1', int(time.time()) + 60)
        assert pcp.get('key1') == 'value1'
        assert len(pcp.cache) == 1
        pcp.cache['key1'] = ('value1', int(time.time()) - 1)
        assert pcp.get('key1') is None
        assert len(pcp.cache) == 0

    def test_lru_eviction(self, ie, logger):
        MAX_SIZE = 2
        provider = MemoryLRUPCP(ie, logger, {}, initialize_cache=lambda max_size: (OrderedDict(), threading.Lock(), MAX_SIZE))
        provider.store('key1', 'value1', int(time.time()) + 5)
        provider.store('key2', 'value2', int(time.time()) + 5)
        assert len(provider.cache) == 2

        assert provider.get('key1') == 'value1'

        provider.store('key3', 'value3', int(time.time()) + 5)
        assert len(provider.cache) == 2

        assert provider.get('key2') is None

        provider.store('key4', 'value4', int(time.time()) + 5)
        assert len(provider.cache) == 2

        assert provider.get('key1') is None
        assert provider.get('key3') == 'value3'
        assert provider.get('key4') == 'value4'

    def test_delete(self, pcp):
        pcp.store('key1', 'value1', int(time.time()) + 5)
        assert len(pcp.cache) == 1
        assert pcp.get('key1') == 'value1'
        pcp.delete('key1')
        assert len(pcp.cache) == 0
        assert pcp.get('key1') is None

    def test_use_global_cache_default(self, ie, logger):
        pcp = MemoryLRUPCP(ie, logger, {})
        assert pcp.max_size == _pot_memory_cache.value['max_size'] == 25
        assert pcp.cache is _pot_memory_cache.value['cache']
        assert pcp.lock is _pot_memory_cache.value['lock']

        pcp2 = MemoryLRUPCP(ie, logger, {})
        assert pcp.max_size == pcp2.max_size == _pot_memory_cache.value['max_size'] == 25
        assert pcp.cache is pcp2.cache is _pot_memory_cache.value['cache']
        assert pcp.lock is pcp2.lock is _pot_memory_cache.value['lock']

    def test_fail_max_size_change_global(self, ie, logger):
        pcp = MemoryLRUPCP(ie, logger, {})
        assert pcp.max_size == _pot_memory_cache.value['max_size'] == 25
        with pytest.raises(ValueError, match='Cannot change max_size of initialized global memory cache'):
            initialize_global_cache(50)

        assert pcp.max_size == _pot_memory_cache.value['max_size'] == 25

    def test_memory_lru_preference(self, pcp, ie, pot_request):
        assert memorylru_preference(pcp, pot_request) == 10000
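
The behaviour these tests pin down (expired entries ignored on store, re-stored keys moved to the most-recent position, expired entries dropped on read, oldest entry evicted at capacity) can be sketched with an `OrderedDict`. This is an illustrative stand-in, not the `MemoryLRUPCP` implementation: the class name `TinyLRUCache` and the exact expiry comparison are assumptions, while the `(value, expires_at)` tuple layout and the default size of 25 follow what the tests show.

```python
import threading
import time
from collections import OrderedDict


class TinyLRUCache:
    """Hypothetical LRU cache with per-entry expiry (sketch only)."""

    def __init__(self, max_size=25):
        self.cache = OrderedDict()  # key -> (value, expires_at)
        self.lock = threading.Lock()
        self.max_size = max_size

    @staticmethod
    def _expired(expires_at):
        # Assumption: an entry expiring exactly "now" counts as expired
        return expires_at <= int(time.time())

    def store(self, key, value, expires_at):
        with self.lock:
            if self._expired(expires_at):
                return  # never store an already-expired entry
            # Remove first so an overridden key lands at the most-recent end
            self.cache.pop(key, None)
            self.cache[key] = (value, expires_at)
            if len(self.cache) > self.max_size:
                self.cache.popitem(last=False)  # evict least recently used

    def get(self, key):
        with self.lock:
            if key not in self.cache:
                return None
            value, expires_at = self.cache.pop(key)
            if self._expired(expires_at):
                return None  # expired entries are dropped on read
            self.cache[key] = (value, expires_at)  # mark as most recently used
            return value

    def delete(self, key):
        with self.lock:
            self.cache.pop(key, None)
```

Popping and re-inserting on every successful `get` is what makes a read count as a "use", which is why `key2` rather than `key1` is evicted first in `test_lru_eviction`.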


@ -0,0 +1,47 @@
import pytest
from yt_dlp.extractor.youtube.pot.provider import (
    PoTokenContext,
)
from yt_dlp.extractor.youtube.pot.utils import get_webpo_content_binding, ContentBindingType


class TestGetWebPoContentBinding:

    @pytest.mark.parametrize('client_name, context, is_authenticated, expected', [
        *[(client, context, is_authenticated, expected) for client in [
            'WEB', 'MWEB', 'TVHTML5', 'WEB_EMBEDDED_PLAYER', 'WEB_CREATOR', 'TVHTML5_SIMPLY_EMBEDDED_PLAYER']
            for context, is_authenticated, expected in [
                (PoTokenContext.GVS, False, ('example-visitor-data', ContentBindingType.VISITOR_DATA)),
                (PoTokenContext.PLAYER, False, ('example-video-id', ContentBindingType.VIDEO_ID)),
                (PoTokenContext.SUBS, False, ('example-video-id', ContentBindingType.VIDEO_ID)),
                (PoTokenContext.GVS, True, ('example-data-sync-id', ContentBindingType.DATASYNC_ID)),
        ]],
        ('WEB_REMIX', PoTokenContext.GVS, False, ('example-visitor-data', ContentBindingType.VISITOR_DATA)),
        ('WEB_REMIX', PoTokenContext.PLAYER, False, ('example-visitor-data', ContentBindingType.VISITOR_DATA)),
        ('ANDROID', PoTokenContext.GVS, False, (None, None)),
        ('IOS', PoTokenContext.GVS, False, (None, None)),
    ])
    def test_get_webpo_content_binding(self, pot_request, client_name, context, is_authenticated, expected):
        pot_request.innertube_context['client']['clientName'] = client_name
        pot_request.context = context
        pot_request.is_authenticated = is_authenticated
        assert get_webpo_content_binding(pot_request) == expected

    def test_extract_visitor_id(self, pot_request):
        pot_request.visitor_data = 'CgsxMjNhYmNYWVpfLSiA4s%2DqBg%3D%3D'
        assert get_webpo_content_binding(pot_request, bind_to_visitor_id=True) == ('123abcXYZ_-', ContentBindingType.VISITOR_ID)

    def test_invalid_visitor_id(self, pot_request):
        # visitor id not alphanumeric (i.e. protobuf extraction failed)
        pot_request.visitor_data = 'CggxMjM0NTY3OCiA4s-qBg%3D%3D'
        assert get_webpo_content_binding(pot_request, bind_to_visitor_id=True) == (pot_request.visitor_data, ContentBindingType.VISITOR_DATA)

    def test_no_visitor_id(self, pot_request):
        pot_request.visitor_data = 'KIDiz6oG'
        assert get_webpo_content_binding(pot_request, bind_to_visitor_id=True) == (pot_request.visitor_data, ContentBindingType.VISITOR_DATA)

    def test_invalid_base64(self, pot_request):
        pot_request.visitor_data = 'invalid-base64'
        assert get_webpo_content_binding(pot_request, bind_to_visitor_id=True) == (pot_request.visitor_data, ContentBindingType.VISITOR_DATA)
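
The four visitor-data fixtures above exercise one success path and three fallbacks. The sketch below shows one way such an extraction could work, assuming the visitor ID is the first length-delimited protobuf field of the URL-quoted, URL-safe base64 payload and is always 11 characters from `[A-Za-z0-9_-]`. Both the function name `extract_visitor_id_sketch` and that length rule are assumptions for illustration; the actual logic in `yt_dlp.extractor.youtube.pot.utils` is not shown in this diff.

```python
import base64
import binascii
import re
import urllib.parse


def extract_visitor_id_sketch(visitor_data):
    """Hypothetical helper: return the visitor ID, or None if it cannot be extracted."""
    try:
        raw = base64.urlsafe_b64decode(urllib.parse.unquote(visitor_data))
    except (binascii.Error, ValueError):
        return None  # not valid base64
    # Expect protobuf field 1 with wire type 2 (tag byte 0x0A), then a one-byte length
    if len(raw) < 2 or raw[0] != 0x0A:
        return None
    length = raw[1]
    candidate = raw[2:2 + length]
    if len(candidate) != length:
        return None  # payload truncated
    try:
        text = candidate.decode('ascii')
    except UnicodeDecodeError:
        return None
    # Assumption: a valid visitor ID is exactly 11 URL-safe characters
    return text if re.fullmatch(r'[A-Za-z0-9_-]{11}', text) else None
```

When the helper returns `None`, a caller would fall back to binding to the raw `visitor_data`, which is the behaviour the three failure tests assert.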


@ -0,0 +1,92 @@
import pytest

from yt_dlp.extractor.youtube.pot._provider import IEContentProvider, BuiltinIEContentProvider
from yt_dlp.extractor.youtube.pot.cache import CacheProviderWritePolicy
from yt_dlp.utils import bug_reports_message
from yt_dlp.extractor.youtube.pot.provider import (
    PoTokenRequest,
    PoTokenContext,
)
from yt_dlp.version import __version__

from yt_dlp.extractor.youtube.pot._builtin.webpo_cachespec import WebPoPCSP
from yt_dlp.extractor.youtube.pot._registry import _pot_pcs_providers


@pytest.fixture()
def pot_request(pot_request) -> PoTokenRequest:
    pot_request.visitor_data = 'CgsxMjNhYmNYWVpfLSiA4s%2DqBg%3D%3D'  # visitor_id=123abcXYZ_-
    return pot_request


class TestWebPoPCSP:
    def test_base_type(self):
        assert issubclass(WebPoPCSP, IEContentProvider)
        assert issubclass(WebPoPCSP, BuiltinIEContentProvider)

    def test_init(self, ie, logger):
        pcs = WebPoPCSP(ie=ie, logger=logger, settings={})
        assert pcs.PROVIDER_NAME == 'webpo'
        assert pcs.PROVIDER_VERSION == __version__
        assert pcs.BUG_REPORT_MESSAGE == bug_reports_message(before='')
        assert pcs.is_available()

    def test_is_registered(self):
        assert _pot_pcs_providers.value.get('WebPo') == WebPoPCSP

    @pytest.mark.parametrize('client_name, context, is_authenticated', [
        ('ANDROID', PoTokenContext.GVS, False),
        ('IOS', PoTokenContext.GVS, False),
        ('IOS', PoTokenContext.PLAYER, False),
    ])
    def test_not_supports(self, ie, logger, pot_request, client_name, context, is_authenticated):
        pcs = WebPoPCSP(ie=ie, logger=logger, settings={})
        pot_request.innertube_context['client']['clientName'] = client_name
        pot_request.context = context
        pot_request.is_authenticated = is_authenticated
        assert pcs.generate_cache_spec(pot_request) is None

    @pytest.mark.parametrize('client_name, context, is_authenticated, remote_host, source_address, request_proxy, expected', [
        *[(client, context, is_authenticated, remote_host, source_address, request_proxy, expected) for client in [
            'WEB', 'MWEB', 'TVHTML5', 'WEB_EMBEDDED_PLAYER', 'WEB_CREATOR', 'TVHTML5_SIMPLY_EMBEDDED_PLAYER']
            for context, is_authenticated, remote_host, source_address, request_proxy, expected in [
                (PoTokenContext.GVS, False, 'example-remote-host', 'example-source-address', 'example-request-proxy', {'t': 'webpo', 'ip': 'example-remote-host', 'sa': 'example-source-address', 'px': 'example-request-proxy', 'cb': '123abcXYZ_-', 'cbt': 'visitor_id'}),
                (PoTokenContext.PLAYER, False, 'example-remote-host', 'example-source-address', 'example-request-proxy', {'t': 'webpo', 'ip': 'example-remote-host', 'sa': 'example-source-address', 'px': 'example-request-proxy', 'cb': '123abcXYZ_-', 'cbt': 'video_id'}),
                (PoTokenContext.GVS, True, 'example-remote-host', 'example-source-address', 'example-request-proxy', {'t': 'webpo', 'ip': 'example-remote-host', 'sa': 'example-source-address', 'px': 'example-request-proxy', 'cb': 'example-data-sync-id', 'cbt': 'datasync_id'}),
        ]],
        ('WEB_REMIX', PoTokenContext.PLAYER, False, 'example-remote-host', 'example-source-address', 'example-request-proxy', {'t': 'webpo', 'ip': 'example-remote-host', 'sa': 'example-source-address', 'px': 'example-request-proxy', 'cb': '123abcXYZ_-', 'cbt': 'visitor_id'}),
        ('WEB', PoTokenContext.GVS, False, None, None, None, {'t': 'webpo', 'cb': '123abcXYZ_-', 'cbt': 'visitor_id', 'ip': None, 'sa': None, 'px': None}),
        ('TVHTML5', PoTokenContext.PLAYER, False, None, None, 'http://example.com', {'t': 'webpo', 'cb': '123abcXYZ_-', 'cbt': 'video_id', 'ip': None, 'sa': None, 'px': 'http://example.com'}),
    ])
    def test_generate_key_bindings(self, ie, logger, pot_request, client_name, context, is_authenticated, remote_host, source_address, request_proxy, expected):
        pcs = WebPoPCSP(ie=ie, logger=logger, settings={})
        pot_request.innertube_context['client']['clientName'] = client_name
        pot_request.context = context
        pot_request.is_authenticated = is_authenticated
        pot_request.innertube_context['client']['remoteHost'] = remote_host
        pot_request.request_source_address = source_address
        pot_request.request_proxy = request_proxy
        pot_request.video_id = '123abcXYZ_-'  # same as visitor id to test type

        assert pcs.generate_cache_spec(pot_request).key_bindings == expected

    def test_no_bind_visitor_id(self, ie, logger, pot_request):
        # Should not bind to visitor id if setting is set to False
        pcs = WebPoPCSP(ie=ie, logger=logger, settings={'bind_to_visitor_id': ['false']})
        pot_request.innertube_context['client']['clientName'] = 'WEB'
        pot_request.context = PoTokenContext.GVS
        pot_request.is_authenticated = False
        assert pcs.generate_cache_spec(pot_request).key_bindings == {'t': 'webpo', 'ip': None, 'sa': None, 'px': None, 'cb': 'CgsxMjNhYmNYWVpfLSiA4s%2DqBg%3D%3D', 'cbt': 'visitor_data'}

    def test_default_ttl(self, ie, logger, pot_request):
        pcs = WebPoPCSP(ie=ie, logger=logger, settings={})
        assert pcs.generate_cache_spec(pot_request).default_ttl == 6 * 60 * 60  # should default to 6 hours

    def test_write_policy(self, ie, logger, pot_request):
        pcs = WebPoPCSP(ie=ie, logger=logger, settings={})
        pot_request.context = PoTokenContext.GVS
        assert pcs.generate_cache_spec(pot_request).write_policy == CacheProviderWritePolicy.WRITE_ALL
        pot_request.context = PoTokenContext.PLAYER
        assert pcs.generate_cache_spec(pot_request).write_policy == CacheProviderWritePolicy.WRITE_FIRST
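
The `key_bindings` dict produced by a cache spec has to be reduced to a single lookup key somewhere in the cache layer. How yt-dlp does that is not shown in this diff. The following is a sketch under the assumption that a stable hash over the sorted bindings is sufficient; the function name `cache_key_from_bindings` is hypothetical.

```python
import hashlib
import json


def cache_key_from_bindings(key_bindings):
    """Hypothetical helper: derive an order-independent cache key from key bindings."""
    # Sort the items so that dict insertion order does not affect the key;
    # None values are kept and serialize as JSON null.
    payload = json.dumps(sorted(key_bindings.items()), separators=(',', ':'))
    return hashlib.sha256(payload.encode('utf-8')).hexdigest()
```

Because the binding type (`cbt`) is part of the hashed payload, a video ID and a visitor ID with the same string value would still map to different keys, which is the distinction `test_generate_key_bindings` checks at the spec level.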

File diff suppressed because it is too large


@ -0,0 +1,629 @@
import pytest

from yt_dlp.extractor.youtube.pot._provider import IEContentProvider
from yt_dlp.cookies import YoutubeDLCookieJar
from yt_dlp.utils.networking import HTTPHeaderDict
from yt_dlp.extractor.youtube.pot.provider import (
    PoTokenRequest,
    PoTokenContext,
    ExternalRequestFeature,
)

from yt_dlp.extractor.youtube.pot.cache import (
    PoTokenCacheProvider,
    PoTokenCacheSpec,
    PoTokenCacheSpecProvider,
    CacheProviderWritePolicy,
)

import yt_dlp.extractor.youtube.pot.cache as cache

from yt_dlp.networking import Request
from yt_dlp.extractor.youtube.pot.provider import (
    PoTokenResponse,
    PoTokenProvider,
    PoTokenProviderRejectedRequest,
    provider_bug_report_message,
    register_provider,
    register_preference,
)

from yt_dlp.extractor.youtube.pot._registry import _pot_providers, _ptp_preferences, _pot_pcs_providers, _pot_cache_providers, _pot_cache_provider_preferences


class ExamplePTP(PoTokenProvider):
    PROVIDER_NAME = 'example'
    PROVIDER_VERSION = '0.0.1'
    BUG_REPORT_LOCATION = 'https://example.com/issues'

    _SUPPORTED_CLIENTS = ('WEB',)
    _SUPPORTED_CONTEXTS = (PoTokenContext.GVS, )

    _SUPPORTED_EXTERNAL_REQUEST_FEATURES = (
        ExternalRequestFeature.PROXY_SCHEME_HTTP,
        ExternalRequestFeature.PROXY_SCHEME_SOCKS5H,
    )

    def is_available(self) -> bool:
        return True

    def _real_request_pot(self, request: PoTokenRequest) -> PoTokenResponse:
        return PoTokenResponse('example-token', expires_at=123)


class ExampleCacheProviderPCP(PoTokenCacheProvider):

    PROVIDER_NAME = 'example'
    PROVIDER_VERSION = '0.0.1'
    BUG_REPORT_LOCATION = 'https://example.com/issues'

    def is_available(self) -> bool:
        return True

    def get(self, key: str):
        return 'example-cache'

    def store(self, key: str, value: str, expires_at: int):
        pass

    def delete(self, key: str):
        pass


class ExampleCacheSpecProviderPCSP(PoTokenCacheSpecProvider):

    PROVIDER_NAME = 'example'
    PROVIDER_VERSION = '0.0.1'
    BUG_REPORT_LOCATION = 'https://example.com/issues'

    def generate_cache_spec(self, request: PoTokenRequest):
        return PoTokenCacheSpec(
            key_bindings={'field': 'example-key'},
            default_ttl=60,
            write_policy=CacheProviderWritePolicy.WRITE_FIRST,
        )
class TestPoTokenProvider:

    def test_base_type(self):
        assert issubclass(PoTokenProvider, IEContentProvider)

    def test_create_provider_missing_fetch_method(self, ie, logger):
        class MissingMethodsPTP(PoTokenProvider):
            def is_available(self) -> bool:
                return True

        with pytest.raises(TypeError):
            MissingMethodsPTP(ie=ie, logger=logger, settings={})

    def test_create_provider_missing_available_method(self, ie, logger):
        class MissingMethodsPTP(PoTokenProvider):
            def _real_request_pot(self, request: PoTokenRequest) -> PoTokenResponse:
                raise PoTokenProviderRejectedRequest('Not implemented')

        with pytest.raises(TypeError):
            MissingMethodsPTP(ie=ie, logger=logger, settings={})

    def test_barebones_provider(self, ie, logger):
        class BarebonesProviderPTP(PoTokenProvider):
            def is_available(self) -> bool:
                return True

            def _real_request_pot(self, request: PoTokenRequest) -> PoTokenResponse:
                raise PoTokenProviderRejectedRequest('Not implemented')

        provider = BarebonesProviderPTP(ie=ie, logger=logger, settings={})
        assert provider.PROVIDER_NAME == 'BarebonesProvider'
        assert provider.PROVIDER_KEY == 'BarebonesProvider'
        assert provider.PROVIDER_VERSION == '0.0.0'
        assert provider.BUG_REPORT_MESSAGE == 'please report this issue to the provider developer at (developer has not provided a bug report location) .'

    def test_example_provider_success(self, ie, logger, pot_request):
        provider = ExamplePTP(ie=ie, logger=logger, settings={})
        assert provider.PROVIDER_NAME == 'example'
        assert provider.PROVIDER_KEY == 'Example'
        assert provider.PROVIDER_VERSION == '0.0.1'
        assert provider.BUG_REPORT_MESSAGE == 'please report this issue to the provider developer at https://example.com/issues .'
        assert provider.is_available()

        response = provider.request_pot(pot_request)

        assert response.po_token == 'example-token'
        assert response.expires_at == 123

    def test_provider_unsupported_context(self, ie, logger, pot_request):
        provider = ExamplePTP(ie=ie, logger=logger, settings={})
        pot_request.context = PoTokenContext.PLAYER

        with pytest.raises(PoTokenProviderRejectedRequest):
            provider.request_pot(pot_request)

    def test_provider_unsupported_client(self, ie, logger, pot_request):
        provider = ExamplePTP(ie=ie, logger=logger, settings={})
        pot_request.innertube_context['client']['clientName'] = 'ANDROID'

        with pytest.raises(PoTokenProviderRejectedRequest):
            provider.request_pot(pot_request)

    def test_provider_unsupported_proxy_scheme(self, ie, logger, pot_request):
        provider = ExamplePTP(ie=ie, logger=logger, settings={})
        pot_request.request_proxy = 'socks4://example.com'

        with pytest.raises(
            PoTokenProviderRejectedRequest,
            match='External requests by "example" provider do not support proxy scheme "socks4". Supported proxy '
            'schemes: http, socks5h',
        ):
            provider.request_pot(pot_request)

        pot_request.request_proxy = 'http://example.com'

        assert provider.request_pot(pot_request)

    def test_provider_ignore_external_request_features(self, ie, logger, pot_request):
        class InternalPTP(ExamplePTP):
            _SUPPORTED_EXTERNAL_REQUEST_FEATURES = None

        provider = InternalPTP(ie=ie, logger=logger, settings={})

        pot_request.request_proxy = 'socks5://example.com'
        assert provider.request_pot(pot_request)
        pot_request.request_source_address = '0.0.0.0'
        assert provider.request_pot(pot_request)

    def test_provider_unsupported_external_request_source_address(self, ie, logger, pot_request):
        class InternalPTP(ExamplePTP):
            _SUPPORTED_EXTERNAL_REQUEST_FEATURES = tuple()

        provider = InternalPTP(ie=ie, logger=logger, settings={})

        pot_request.request_source_address = None
        assert provider.request_pot(pot_request)

        pot_request.request_source_address = '0.0.0.0'
        with pytest.raises(
            PoTokenProviderRejectedRequest,
            match='External requests by "example" provider do not support setting source address',
        ):
            provider.request_pot(pot_request)

    def test_provider_supported_external_request_source_address(self, ie, logger, pot_request):
        class InternalPTP(ExamplePTP):
            _SUPPORTED_EXTERNAL_REQUEST_FEATURES = (
                ExternalRequestFeature.SOURCE_ADDRESS,
            )

        provider = InternalPTP(ie=ie, logger=logger, settings={})

        pot_request.request_source_address = None
        assert provider.request_pot(pot_request)

        pot_request.request_source_address = '0.0.0.0'
        assert provider.request_pot(pot_request)

    def test_provider_unsupported_external_request_tls_verification(self, ie, logger, pot_request):
        class InternalPTP(ExamplePTP):
            _SUPPORTED_EXTERNAL_REQUEST_FEATURES = tuple()

        provider = InternalPTP(ie=ie, logger=logger, settings={})

        pot_request.request_verify_tls = True
        assert provider.request_pot(pot_request)

        pot_request.request_verify_tls = False
        with pytest.raises(
            PoTokenProviderRejectedRequest,
            match='External requests by "example" provider do not support ignoring TLS certificate failures',
        ):
            provider.request_pot(pot_request)

    def test_provider_supported_external_request_tls_verification(self, ie, logger, pot_request):
        class InternalPTP(ExamplePTP):
            _SUPPORTED_EXTERNAL_REQUEST_FEATURES = (
                ExternalRequestFeature.DISABLE_TLS_VERIFICATION,
            )

        provider = InternalPTP(ie=ie, logger=logger, settings={})

        pot_request.request_verify_tls = True
        assert provider.request_pot(pot_request)

        pot_request.request_verify_tls = False
        assert provider.request_pot(pot_request)

    def test_provider_request_webpage(self, ie, logger, pot_request):
        provider = ExamplePTP(ie=ie, logger=logger, settings={})

        cookiejar = YoutubeDLCookieJar()
        pot_request.request_headers = HTTPHeaderDict({'User-Agent': 'example-user-agent'})
        pot_request.request_proxy = 'socks5://example-proxy.com'
        pot_request.request_cookiejar = cookiejar

        def mock_urlopen(request):
            return request

        ie._downloader.urlopen = mock_urlopen

        sent_request = provider._request_webpage(Request(
            'https://example.com',
        ), pot_request=pot_request)

        assert sent_request.url == 'https://example.com'
        assert sent_request.headers['User-Agent'] == 'example-user-agent'
        assert sent_request.proxies == {'all': 'socks5://example-proxy.com'}
        assert sent_request.extensions['cookiejar'] is cookiejar
        assert 'Requesting webpage' in logger.messages['info']

    def test_provider_request_webpage_override(self, ie, logger, pot_request):
        provider = ExamplePTP(ie=ie, logger=logger, settings={})

        cookiejar_request = YoutubeDLCookieJar()
        pot_request.request_headers = HTTPHeaderDict({'User-Agent': 'example-user-agent'})
        pot_request.request_proxy = 'socks5://example-proxy.com'
        pot_request.request_cookiejar = cookiejar_request

        def mock_urlopen(request):
            return request

        ie._downloader.urlopen = mock_urlopen

        sent_request = provider._request_webpage(Request(
            'https://example.com',
            headers={'User-Agent': 'override-user-agent-override'},
            proxies={'http': 'http://example-proxy-override.com'},
            extensions={'cookiejar': YoutubeDLCookieJar()},
        ), pot_request=pot_request, note='Custom requesting webpage')

        assert sent_request.url == 'https://example.com'
        assert sent_request.headers['User-Agent'] == 'override-user-agent-override'
        assert sent_request.proxies == {'http': 'http://example-proxy-override.com'}
        assert sent_request.extensions['cookiejar'] is not cookiejar_request
        assert 'Custom requesting webpage' in logger.messages['info']

    def test_provider_request_webpage_no_log(self, ie, logger, pot_request):
        provider = ExamplePTP(ie=ie, logger=logger, settings={})

        def mock_urlopen(request):
            return request

        ie._downloader.urlopen = mock_urlopen

        sent_request = provider._request_webpage(Request(
            'https://example.com',
        ), note=False)

        assert sent_request.url == 'https://example.com'
        assert 'info' not in logger.messages

    def test_provider_request_webpage_no_pot_request(self, ie, logger):
        provider = ExamplePTP(ie=ie, logger=logger, settings={})

        def mock_urlopen(request):
            return request

        ie._downloader.urlopen = mock_urlopen

        sent_request = provider._request_webpage(Request(
            'https://example.com',
        ), pot_request=None)

        assert sent_request.url == 'https://example.com'

    def test_get_config_arg(self, ie, logger):
        provider = ExamplePTP(ie=ie, logger=logger, settings={'abc': ['123D'], 'xyz': ['456a', '789B']})

        assert provider._configuration_arg('abc') == ['123d']
        assert provider._configuration_arg('abc', default=['default']) == ['123d']
        assert provider._configuration_arg('ABC', default=['default']) == ['default']
        assert provider._configuration_arg('abc', casesense=True) == ['123D']
        assert provider._configuration_arg('xyz', casesense=False) == ['456a', '789b']

    def test_require_class_end_with_suffix(self, ie, logger):
        class InvalidSuffix(PoTokenProvider):
            PROVIDER_NAME = 'invalid-suffix'

            def _real_request_pot(self, request: PoTokenRequest) -> PoTokenResponse:
                raise PoTokenProviderRejectedRequest('Not implemented')

            def is_available(self) -> bool:
                return True

        provider = InvalidSuffix(ie=ie, logger=logger, settings={})

        with pytest.raises(AssertionError):
            provider.PROVIDER_KEY  # noqa: B018
class TestPoTokenCacheProvider:

    def test_base_type(self):
        assert issubclass(PoTokenCacheProvider, IEContentProvider)

    def test_create_provider_missing_get_method(self, ie, logger):
        class MissingMethodsPCP(PoTokenCacheProvider):
            def store(self, key: str, value: str, expires_at: int):
                pass

            def delete(self, key: str):
                pass

            def is_available(self) -> bool:
                return True

        with pytest.raises(TypeError):
            MissingMethodsPCP(ie=ie, logger=logger, settings={})

    def test_create_provider_missing_store_method(self, ie, logger):
        class MissingMethodsPCP(PoTokenCacheProvider):
            def get(self, key: str):
                pass

            def delete(self, key: str):
                pass

            def is_available(self) -> bool:
                return True

        with pytest.raises(TypeError):
            MissingMethodsPCP(ie=ie, logger=logger, settings={})

    def test_create_provider_missing_delete_method(self, ie, logger):
        class MissingMethodsPCP(PoTokenCacheProvider):
            def get(self, key: str):
                pass

            def store(self, key: str, value: str, expires_at: int):
                pass

            def is_available(self) -> bool:
                return True

        with pytest.raises(TypeError):
            MissingMethodsPCP(ie=ie, logger=logger, settings={})

    def test_create_provider_missing_is_available_method(self, ie, logger):
        class MissingMethodsPCP(PoTokenCacheProvider):
            def get(self, key: str):
                pass

            def store(self, key: str, value: str, expires_at: int):
                pass

            def delete(self, key: str):
                pass

        with pytest.raises(TypeError):
            MissingMethodsPCP(ie=ie, logger=logger, settings={})

    def test_barebones_provider(self, ie, logger):
        class BarebonesProviderPCP(PoTokenCacheProvider):

            def is_available(self) -> bool:
                return True

            def get(self, key: str):
                return 'example-cache'

            def store(self, key: str, value: str, expires_at: int):
                pass

            def delete(self, key: str):
                pass

        provider = BarebonesProviderPCP(ie=ie, logger=logger, settings={})
        assert provider.PROVIDER_NAME == 'BarebonesProvider'
        assert provider.PROVIDER_KEY == 'BarebonesProvider'
        assert provider.PROVIDER_VERSION == '0.0.0'
        assert provider.BUG_REPORT_MESSAGE == 'please report this issue to the provider developer at (developer has not provided a bug report location) .'

    def test_create_provider_example(self, ie, logger):
        provider = ExampleCacheProviderPCP(ie=ie, logger=logger, settings={})
        assert provider.PROVIDER_NAME == 'example'
        assert provider.PROVIDER_KEY == 'ExampleCacheProvider'
        assert provider.PROVIDER_VERSION == '0.0.1'
        assert provider.BUG_REPORT_MESSAGE == 'please report this issue to the provider developer at https://example.com/issues .'
        assert provider.is_available()

    def test_get_config_arg(self, ie, logger):
        provider = ExampleCacheProviderPCP(ie=ie, logger=logger, settings={'abc': ['123D'], 'xyz': ['456a', '789B']})
        assert provider._configuration_arg('abc') == ['123d']
        assert provider._configuration_arg('abc', default=['default']) == ['123d']
        assert provider._configuration_arg('ABC', default=['default']) == ['default']
        assert provider._configuration_arg('abc', casesense=True) == ['123D']
        assert provider._configuration_arg('xyz', casesense=False) == ['456a', '789b']

    def test_require_class_end_with_suffix(self, ie, logger):
        class InvalidSuffix(PoTokenCacheProvider):
            def get(self, key: str):
                return 'example-cache'

            def store(self, key: str, value: str, expires_at: int):
                pass

            def delete(self, key: str):
                pass

            def is_available(self) -> bool:
                return True

        provider = InvalidSuffix(ie=ie, logger=logger, settings={})

        with pytest.raises(AssertionError):
            provider.PROVIDER_KEY  # noqa: B018
class TestPoTokenCacheSpecProvider:

    def test_base_type(self):
        assert issubclass(PoTokenCacheSpecProvider, IEContentProvider)

    def test_create_provider_missing_supports_method(self, ie, logger):
        class MissingMethodsPCS(PoTokenCacheSpecProvider):
            pass

        with pytest.raises(TypeError):
            MissingMethodsPCS(ie=ie, logger=logger, settings={})

    def test_create_provider_barebones(self, ie, pot_request, logger):
        class BarebonesProviderPCSP(PoTokenCacheSpecProvider):
            def generate_cache_spec(self, request: PoTokenRequest):
                return PoTokenCacheSpec(
                    default_ttl=100,
                    key_bindings={},
                )

        provider = BarebonesProviderPCSP(ie=ie, logger=logger, settings={})
        assert provider.PROVIDER_NAME == 'BarebonesProvider'
        assert provider.PROVIDER_KEY == 'BarebonesProvider'
        assert provider.PROVIDER_VERSION == '0.0.0'
        assert provider.BUG_REPORT_MESSAGE == 'please report this issue to the provider developer at (developer has not provided a bug report location) .'
        assert provider.is_available()
        assert provider.generate_cache_spec(request=pot_request).default_ttl == 100
        assert provider.generate_cache_spec(request=pot_request).key_bindings == {}
        assert provider.generate_cache_spec(request=pot_request).write_policy == CacheProviderWritePolicy.WRITE_ALL

    def test_create_provider_example(self, ie, pot_request, logger):
        provider = ExampleCacheSpecProviderPCSP(ie=ie, logger=logger, settings={})
        assert provider.PROVIDER_NAME == 'example'
        assert provider.PROVIDER_KEY == 'ExampleCacheSpecProvider'
        assert provider.PROVIDER_VERSION == '0.0.1'
        assert provider.BUG_REPORT_MESSAGE == 'please report this issue to the provider developer at https://example.com/issues .'
        assert provider.is_available()
        assert provider.generate_cache_spec(pot_request)
        assert provider.generate_cache_spec(pot_request).key_bindings == {'field': 'example-key'}
        assert provider.generate_cache_spec(pot_request).default_ttl == 60
        assert provider.generate_cache_spec(pot_request).write_policy == CacheProviderWritePolicy.WRITE_FIRST

    def test_get_config_arg(self, ie, logger):
        provider = ExampleCacheSpecProviderPCSP(ie=ie, logger=logger, settings={'abc': ['123D'], 'xyz': ['456a', '789B']})

        assert provider._configuration_arg('abc') == ['123d']
        assert provider._configuration_arg('abc', default=['default']) == ['123d']
        assert provider._configuration_arg('ABC', default=['default']) == ['default']
        assert provider._configuration_arg('abc', casesense=True) == ['123D']
        assert provider._configuration_arg('xyz', casesense=False) == ['456a', '789b']

    def test_require_class_end_with_suffix(self, ie, logger):
        class InvalidSuffix(PoTokenCacheSpecProvider):
            def generate_cache_spec(self, request: PoTokenRequest):
                return None

        provider = InvalidSuffix(ie=ie, logger=logger, settings={})

        with pytest.raises(AssertionError):
            provider.PROVIDER_KEY  # noqa: B018
class TestPoTokenRequest:
    def test_copy_request(self, pot_request):
        copied_request = pot_request.copy()

        assert copied_request is not pot_request
        assert copied_request.context == pot_request.context
        assert copied_request.innertube_context == pot_request.innertube_context
        assert copied_request.innertube_context is not pot_request.innertube_context
        copied_request.innertube_context['client']['clientName'] = 'ANDROID'
        assert pot_request.innertube_context['client']['clientName'] != 'ANDROID'
        assert copied_request.innertube_host == pot_request.innertube_host
        assert copied_request.session_index == pot_request.session_index
        assert copied_request.player_url == pot_request.player_url
        assert copied_request.is_authenticated == pot_request.is_authenticated
        assert copied_request.visitor_data == pot_request.visitor_data
        assert copied_request.data_sync_id == pot_request.data_sync_id
        assert copied_request.video_id == pot_request.video_id
        assert copied_request.request_cookiejar is pot_request.request_cookiejar
        assert copied_request.request_proxy == pot_request.request_proxy
        assert copied_request.request_headers == pot_request.request_headers
        assert copied_request.request_headers is not pot_request.request_headers
        assert copied_request.request_timeout == pot_request.request_timeout
        assert copied_request.request_source_address == pot_request.request_source_address
        assert copied_request.request_verify_tls == pot_request.request_verify_tls
        assert copied_request.bypass_cache == pot_request.bypass_cache


def test_provider_bug_report_message(ie, logger):
    provider = ExamplePTP(ie=ie, logger=logger, settings={})
    assert provider.BUG_REPORT_MESSAGE == 'please report this issue to the provider developer at https://example.com/issues .'

    message = provider_bug_report_message(provider)
    assert message == '; please report this issue to the provider developer at https://example.com/issues .'

    message_before = provider_bug_report_message(provider, before='custom message!')
    assert message_before == 'custom message! Please report this issue to the provider developer at https://example.com/issues .'


def test_register_provider(ie):

    @register_provider
    class UnavailableProviderPTP(PoTokenProvider):
        def is_available(self) -> bool:
            return False

        def _real_request_pot(self, request: PoTokenRequest) -> PoTokenResponse:
            raise PoTokenProviderRejectedRequest('Not implemented')

    assert _pot_providers.value.get('UnavailableProvider') == UnavailableProviderPTP
    _pot_providers.value.pop('UnavailableProvider')


def test_register_pot_preference(ie):
    before = len(_ptp_preferences.value)

    @register_preference(ExamplePTP)
    def unavailable_preference(provider: PoTokenProvider, request: PoTokenRequest):
        return 1

    assert len(_ptp_preferences.value) == before + 1


def test_register_cache_provider(ie):

    @cache.register_provider
    class UnavailableCacheProviderPCP(PoTokenCacheProvider):
        def is_available(self) -> bool:
            return False

        def get(self, key: str):
            return 'example-cache'

        def store(self, key: str, value: str, expires_at: int):
            pass

        def delete(self, key: str):
            pass

    assert _pot_cache_providers.value.get('UnavailableCacheProvider') == UnavailableCacheProviderPCP
    _pot_cache_providers.value.pop('UnavailableCacheProvider')


def test_register_cache_provider_spec(ie):

    @cache.register_spec
    class UnavailableCacheProviderPCSP(PoTokenCacheSpecProvider):
        def is_available(self) -> bool:
            return False

        def generate_cache_spec(self, request: PoTokenRequest):
            return None

    assert _pot_pcs_providers.value.get('UnavailableCacheProvider') == UnavailableCacheProviderPCSP
    _pot_pcs_providers.value.pop('UnavailableCacheProvider')


def test_register_cache_provider_preference(ie):
    before = len(_pot_cache_provider_preferences.value)

    @cache.register_preference(ExampleCacheProviderPCP)
    def unavailable_preference(provider: PoTokenCacheProvider, request: PoTokenRequest):
        return 1

    assert len(_pot_cache_provider_preferences.value) == before + 1


def test_logger_log_level(logger):
    assert logger.LogLevel('INFO') == logger.LogLevel.INFO
    assert logger.LogLevel('debuG') == logger.LogLevel.DEBUG
    assert logger.LogLevel(10) == logger.LogLevel.DEBUG
    assert logger.LogLevel('UNKNOWN') == logger.LogLevel.INFO
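
The lenient lookup that `test_logger_log_level` expects (case-insensitive names, numeric values, and a fallback to `INFO` for anything unknown) maps naturally onto the `_missing_` hook of Python's `enum` module. The sketch below is one possible shape, not the logger from this commit: only `DEBUG == 10` is confirmed by the test, and the other numeric values and members are assumptions.

```python
import enum


class LogLevel(enum.IntEnum):
    # Only DEBUG = 10 is confirmed by the test above; the rest are assumed values
    TRACE = 0
    DEBUG = 10
    INFO = 20
    WARNING = 30
    ERROR = 40

    @classmethod
    def _missing_(cls, value):
        # Called by Enum only when the normal by-value lookup fails
        if isinstance(value, str):
            name = value.upper()
            if name in cls.__members__:
                return cls[name]
        # Anything unrecognised falls back to INFO instead of raising ValueError
        return cls.INFO
```

Returning a member from `_missing_` instead of raising keeps a mistyped log level from aborting the run, at the cost of silently accepting typos.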


@ -23,7 +23,6 @@
TedTalkIE,
ThePlatformFeedIE,
ThePlatformIE,
VikiIE,
VimeoIE,
WallaIE,
YoutubeIE,
@ -331,20 +330,6 @@ def test_subtitles_array_key(self):
self.assertEqual(md5(subtitles['it']), '4b3264186fbb103508abe5311cfcb9cd')
@is_download_test
@unittest.skip('IE broken - DRM only')
class TestVikiSubtitles(BaseTestSubtitles):
url = 'http://www.viki.com/videos/1060846v-punch-episode-18'
IE = VikiIE
def test_allsubtitles(self):
self.DL.params['writesubtitles'] = True
self.DL.params['allsubtitles'] = True
subtitles = self.getSubtitles()
self.assertEqual(set(subtitles.keys()), {'en'})
self.assertEqual(md5(subtitles['en']), '53cb083a5914b2d84ef1ab67b880d18a')
@is_download_test
class TestThePlatformSubtitles(BaseTestSubtitles):
# from http://www.3playmedia.com/services-features/tools/integrations/theplatform/


@ -219,11 +219,8 @@ def test_sanitize_ids(self):
self.assertEqual(sanitize_filename('_BD_eEpuzXw', is_id=True), '_BD_eEpuzXw')
self.assertEqual(sanitize_filename('N0Y__7-UOdI', is_id=True), 'N0Y__7-UOdI')
@unittest.mock.patch('sys.platform', 'win32')
def test_sanitize_path(self):
with unittest.mock.patch('sys.platform', 'win32'):
self._test_sanitize_path()
def _test_sanitize_path(self):
self.assertEqual(sanitize_path('abc'), 'abc')
self.assertEqual(sanitize_path('abc/def'), 'abc\\def')
self.assertEqual(sanitize_path('abc\\def'), 'abc\\def')
@ -254,10 +251,8 @@ def _test_sanitize_path(self):
# Check with nt._path_normpath if available
try:
import nt
nt_path_normpath = getattr(nt, '_path_normpath', None)
except Exception:
from nt import _path_normpath as nt_path_normpath
except ImportError:
nt_path_normpath = None
for test, expected in [
@ -664,6 +659,8 @@ def test_url_or_none(self):
self.assertEqual(url_or_none('mms://foo.de'), 'mms://foo.de')
self.assertEqual(url_or_none('rtspu://foo.de'), 'rtspu://foo.de')
self.assertEqual(url_or_none('ftps://foo.de'), 'ftps://foo.de')
self.assertEqual(url_or_none('ws://foo.de'), 'ws://foo.de')
self.assertEqual(url_or_none('wss://foo.de'), 'wss://foo.de')
def test_parse_age_limit(self):
self.assertEqual(parse_age_limit(None), None)
@ -1265,6 +1262,7 @@ def test_js_to_json_edgecases(self):
def test_js_to_json_malformed(self):
self.assertEqual(js_to_json('42a1'), '42"a1"')
self.assertEqual(js_to_json('42a-1'), '42"a"-1')
self.assertEqual(js_to_json('{a: `${e("")}`}'), '{"a": "\\"e\\"(\\"\\")"}')
def test_js_to_json_template_literal(self):
self.assertEqual(js_to_json('`Hello ${name}`', {'name': '"world"'}), '"Hello world"')

View File

@ -78,6 +78,61 @@
'2aq0aqSyOoJXtK73m-uME_jv7-pT15gOFC02RFkGMqWpzEICs69VdbwQ0LDp1v7j8xx92efCJlYFYb1sUkkBSPOlPmXgIARw8JQ0qOAOAA',
'0QJ8wRAIgXmPlOPSBkkUs1bYFYlJCfe29xxAj7v1pDL0QwbdV96sCIEzpWqMGkFR20CFOg51Tp-7vj_EMu-m37KtXJ2OySqa0q',
),
(
'https://www.youtube.com/s/player/643afba4/tv-player-ias.vflset/tv-player-ias.js',
'2aq0aqSyOoJXtK73m-uME_jv7-pT15gOFC02RFkGMqWpzEICs69VdbwQ0LDp1v7j8xx92efCJlYFYb1sUkkBSPOlPmXgIARw8JQ0qOAOAA',
'AAOAOq0QJ8wRAIgXmPlOPSBkkUs1bYFYlJCfe29xx8j7vgpDL0QwbdV06sCIEzpWqMGkFR20CFOS21Tp-7vj_EMu-m37KtXJoOy1',
),
(
'https://www.youtube.com/s/player/363db69b/player_ias.vflset/en_US/base.js',
'2aq0aqSyOoJXtK73m-uME_jv7-pT15gOFC02RFkGMqWpzEICs69VdbwQ0LDp1v7j8xx92efCJlYFYb1sUkkBSPOlPmXgIARw8JQ0qOAOAA',
'0aqSyOoJXtK73m-uME_jv7-pT15gOFC02RFkGMqWpz2ICs6EVdbwQ0LDp1v7j8xx92efCJlYFYb1sUkkBSPOlPmXgIARw8JQ0qOAOAA',
),
(
'https://www.youtube.com/s/player/363db69b/player_ias_tce.vflset/en_US/base.js',
'2aq0aqSyOoJXtK73m-uME_jv7-pT15gOFC02RFkGMqWpzEICs69VdbwQ0LDp1v7j8xx92efCJlYFYb1sUkkBSPOlPmXgIARw8JQ0qOAOAA',
'0aqSyOoJXtK73m-uME_jv7-pT15gOFC02RFkGMqWpz2ICs6EVdbwQ0LDp1v7j8xx92efCJlYFYb1sUkkBSPOlPmXgIARw8JQ0qOAOAA',
),
(
'https://www.youtube.com/s/player/4fcd6e4a/player_ias.vflset/en_US/base.js',
'2aq0aqSyOoJXtK73m-uME_jv7-pT15gOFC02RFkGMqWpzEICs69VdbwQ0LDp1v7j8xx92efCJlYFYb1sUkkBSPOlPmXgIARw8JQ0qOAOAA',
'wAOAOq0QJ8ARAIgXmPlOPSBkkUs1bYFYlJCfe29xx8q7v1pDL0QwbdV96sCIEzpWqMGkFR20CFOg51Tp-7vj_EMu-m37KtXJoOySqa0',
),
(
'https://www.youtube.com/s/player/4fcd6e4a/player_ias_tce.vflset/en_US/base.js',
'2aq0aqSyOoJXtK73m-uME_jv7-pT15gOFC02RFkGMqWpzEICs69VdbwQ0LDp1v7j8xx92efCJlYFYb1sUkkBSPOlPmXgIARw8JQ0qOAOAA',
'wAOAOq0QJ8ARAIgXmPlOPSBkkUs1bYFYlJCfe29xx8q7v1pDL0QwbdV96sCIEzpWqMGkFR20CFOg51Tp-7vj_EMu-m37KtXJoOySqa0',
),
(
'https://www.youtube.com/s/player/20830619/player_ias.vflset/en_US/base.js',
'2aq0aqSyOoJXtK73m-uME_jv7-pT15gOFC02RFkGMqWpzEICs69VdbwQ0LDp1v7j8xx92efCJlYFYb1sUkkBSPOlPmXgIARw8JQ0qOAOAA',
'7AOq0QJ8wRAIgXmPlOPSBkkAs1bYFYlJCfe29xx8jOv1pDL0Q2bdV96sCIEzpWqMGkFR20CFOg51Tp-7vj_EMu-m37KtXJoOySqa0qaw',
),
(
'https://www.youtube.com/s/player/20830619/player_ias_tce.vflset/en_US/base.js',
'2aq0aqSyOoJXtK73m-uME_jv7-pT15gOFC02RFkGMqWpzEICs69VdbwQ0LDp1v7j8xx92efCJlYFYb1sUkkBSPOlPmXgIARw8JQ0qOAOAA',
'7AOq0QJ8wRAIgXmPlOPSBkkAs1bYFYlJCfe29xx8jOv1pDL0Q2bdV96sCIEzpWqMGkFR20CFOg51Tp-7vj_EMu-m37KtXJoOySqa0qaw',
),
(
'https://www.youtube.com/s/player/20830619/player-plasma-ias-phone-en_US.vflset/base.js',
'2aq0aqSyOoJXtK73m-uME_jv7-pT15gOFC02RFkGMqWpzEICs69VdbwQ0LDp1v7j8xx92efCJlYFYb1sUkkBSPOlPmXgIARw8JQ0qOAOAA',
'7AOq0QJ8wRAIgXmPlOPSBkkAs1bYFYlJCfe29xx8jOv1pDL0Q2bdV96sCIEzpWqMGkFR20CFOg51Tp-7vj_EMu-m37KtXJoOySqa0qaw',
),
(
'https://www.youtube.com/s/player/20830619/player-plasma-ias-tablet-en_US.vflset/base.js',
'2aq0aqSyOoJXtK73m-uME_jv7-pT15gOFC02RFkGMqWpzEICs69VdbwQ0LDp1v7j8xx92efCJlYFYb1sUkkBSPOlPmXgIARw8JQ0qOAOAA',
'7AOq0QJ8wRAIgXmPlOPSBkkAs1bYFYlJCfe29xx8jOv1pDL0Q2bdV96sCIEzpWqMGkFR20CFOg51Tp-7vj_EMu-m37KtXJoOySqa0qaw',
),
(
'https://www.youtube.com/s/player/8a8ac953/player_ias_tce.vflset/en_US/base.js',
'2aq0aqSyOoJXtK73m-uME_jv7-pT15gOFC02RFkGMqWpzEICs69VdbwQ0LDp1v7j8xx92efCJlYFYb1sUkkBSPOlPmXgIARw8JQ0qOAOAA',
'IAOAOq0QJ8wRAAgXmPlOPSBkkUs1bYFYlJCfe29xx8j7v1pDL0QwbdV96sCIEzpWqMGkFR20CFOg51Tp-7vj_E2u-m37KtXJoOySqa0',
),
(
'https://www.youtube.com/s/player/8a8ac953/tv-player-es6.vflset/tv-player-es6.js',
'2aq0aqSyOoJXtK73m-uME_jv7-pT15gOFC02RFkGMqWpzEICs69VdbwQ0LDp1v7j8xx92efCJlYFYb1sUkkBSPOlPmXgIARw8JQ0qOAOAA',
'IAOAOq0QJ8wRAAgXmPlOPSBkkUs1bYFYlJCfe29xx8j7v1pDL0QwbdV96sCIEzpWqMGkFR20CFOg51Tp-7vj_E2u-m37KtXJoOySqa0',
),
]
_NSIG_TESTS = [
@ -205,6 +260,66 @@
'https://www.youtube.com/s/player/9c6dfc4a/player_ias.vflset/en_US/base.js',
'jbu7ylIosQHyJyJV', 'uwI0ESiynAmhNg',
),
(
'https://www.youtube.com/s/player/e7567ecf/player_ias_tce.vflset/en_US/base.js',
'Sy4aDGc0VpYRR9ew_', '5UPOT1VhoZxNLQ',
),
(
'https://www.youtube.com/s/player/d50f54ef/player_ias_tce.vflset/en_US/base.js',
'Ha7507LzRmH3Utygtj', 'XFTb2HoeOE5MHg',
),
(
'https://www.youtube.com/s/player/074a8365/player_ias_tce.vflset/en_US/base.js',
'Ha7507LzRmH3Utygtj', 'ufTsrE0IVYrkl8v',
),
(
'https://www.youtube.com/s/player/643afba4/player_ias.vflset/en_US/base.js',
'N5uAlLqm0eg1GyHO', 'dCBQOejdq5s-ww',
),
(
'https://www.youtube.com/s/player/69f581a5/tv-player-ias.vflset/tv-player-ias.js',
'-qIP447rVlTTwaZjY', 'KNcGOksBAvwqQg',
),
(
'https://www.youtube.com/s/player/643afba4/tv-player-ias.vflset/tv-player-ias.js',
'ir9-V6cdbCiyKxhr', '2PL7ZDYAALMfmA',
),
(
'https://www.youtube.com/s/player/363db69b/player_ias.vflset/en_US/base.js',
'eWYu5d5YeY_4LyEDc', 'XJQqf-N7Xra3gg',
),
(
'https://www.youtube.com/s/player/4fcd6e4a/player_ias.vflset/en_US/base.js',
'o_L251jm8yhZkWtBW', 'lXoxI3XvToqn6A',
),
(
'https://www.youtube.com/s/player/4fcd6e4a/player_ias_tce.vflset/en_US/base.js',
'o_L251jm8yhZkWtBW', 'lXoxI3XvToqn6A',
),
(
'https://www.youtube.com/s/player/20830619/tv-player-ias.vflset/tv-player-ias.js',
'ir9-V6cdbCiyKxhr', '9YE85kNjZiS4',
),
(
'https://www.youtube.com/s/player/20830619/player-plasma-ias-phone-en_US.vflset/base.js',
'ir9-V6cdbCiyKxhr', '9YE85kNjZiS4',
),
(
'https://www.youtube.com/s/player/20830619/player-plasma-ias-tablet-en_US.vflset/base.js',
'ir9-V6cdbCiyKxhr', '9YE85kNjZiS4',
),
(
'https://www.youtube.com/s/player/8a8ac953/player_ias_tce.vflset/en_US/base.js',
'MiBYeXx_vRREbiCCmh', 'RtZYMVvmkE0JE',
),
(
'https://www.youtube.com/s/player/8a8ac953/tv-player-es6.vflset/tv-player-es6.js',
'MiBYeXx_vRREbiCCmh', 'RtZYMVvmkE0JE',
),
(
'https://www.youtube.com/s/player/59b252b9/player_ias.vflset/en_US/base.js',
'D3XWVpYgwhLLKNK4AGX', 'aZrQ1qWJ5yv5h',
),
]
@ -218,6 +333,8 @@ def test_youtube_extract_player_info(self):
('https://www.youtube.com/s/player/64dddad9/player-plasma-ias-phone-en_US.vflset/base.js', '64dddad9'),
('https://www.youtube.com/s/player/64dddad9/player-plasma-ias-phone-de_DE.vflset/base.js', '64dddad9'),
('https://www.youtube.com/s/player/64dddad9/player-plasma-ias-tablet-en_US.vflset/base.js', '64dddad9'),
('https://www.youtube.com/s/player/e7567ecf/player_ias_tce.vflset/en_US/base.js', 'e7567ecf'),
('https://www.youtube.com/s/player/643afba4/tv-player-ias.vflset/tv-player-ias.js', '643afba4'),
# obsolete
('https://www.youtube.com/yts/jsbin/player_ias-vfle4-e03/en_US/base.js', 'vfle4-e03'),
('https://www.youtube.com/yts/jsbin/player_ias-vfl49f_g4/en_US/base.js', 'vfl49f_g4'),
@ -250,46 +367,51 @@ def t_factory(name, sig_func, url_pattern):
def make_tfunc(url, sig_input, expected_sig):
m = url_pattern.match(url)
assert m, f'{url!r} should follow URL format'
test_id = m.group('id')
test_id = re.sub(r'[/.-]', '_', m.group('id') or m.group('compat_id'))
def test_func(self):
basename = f'player-{name}-{test_id}.js'
basename = f'player-{test_id}.js'
fn = os.path.join(self.TESTDATA_DIR, basename)
if not os.path.exists(fn):
urllib.request.urlretrieve(url, fn)
with open(fn, encoding='utf-8') as testf:
jscode = testf.read()
self.assertEqual(sig_func(jscode, sig_input), expected_sig)
self.assertEqual(sig_func(jscode, sig_input, url), expected_sig)
test_func.__name__ = f'test_{name}_js_{test_id}'
setattr(TestSignature, test_func.__name__, test_func)
return make_tfunc
def signature(jscode, sig_input):
func = YoutubeIE(FakeYDL())._parse_sig_js(jscode)
def signature(jscode, sig_input, player_url):
func = YoutubeIE(FakeYDL())._parse_sig_js(jscode, player_url)
src_sig = (
str(string.printable[:sig_input])
if isinstance(sig_input, int) else sig_input)
return func(src_sig)
def n_sig(jscode, sig_input):
def n_sig(jscode, sig_input, player_url):
ie = YoutubeIE(FakeYDL())
funcname = ie._extract_n_function_name(jscode)
funcname = ie._extract_n_function_name(jscode, player_url=player_url)
jsi = JSInterpreter(jscode)
func = jsi.extract_function_from_code(*ie._fixup_n_function_code(*jsi.extract_function_code(funcname)))
func = jsi.extract_function_from_code(*ie._fixup_n_function_code(*jsi.extract_function_code(funcname), jscode, player_url))
return func([sig_input])
make_sig_test = t_factory(
'signature', signature, re.compile(r'.*(?:-|/player/)(?P<id>[a-zA-Z0-9_-]+)(?:/.+\.js|(?:/watch_as3|/html5player)?\.[a-z]+)$'))
'signature', signature,
re.compile(r'''(?x)
.+(?:
/player/(?P<id>[a-zA-Z0-9_/.-]+)|
/html5player-(?:en_US-)?(?P<compat_id>[a-zA-Z0-9_-]+)(?:/watch_as3|/html5player)?
)\.js$'''))
for test_spec in _SIG_TESTS:
make_sig_test(*test_spec)
make_nsig_test = t_factory(
'nsig', n_sig, re.compile(r'.+/player/(?P<id>[a-zA-Z0-9_-]+)/.+.js$'))
'nsig', n_sig, re.compile(r'.+/player/(?P<id>[a-zA-Z0-9_/.-]+)\.js$'))
for test_spec in _NSIG_TESTS:
make_nsig_test(*test_spec)
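Note (illustrative, not part of this commit): the updated nsig pattern captures the whole path after /player/ up to the .js suffix, and make_tfunc then replaces '/', '.' and '-' with underscores so that the result can be used in a test name and a cached file name. A small sketch of that derivation follows; the regex and the substitution are copied from the hunk above, while the helper name `player_test_id` is hypothetical.

```python
import re

# Pattern copied from the updated make_nsig_test call in the diff.
NSIG_URL_PATTERN = re.compile(r'.+/player/(?P<id>[a-zA-Z0-9_/.-]+)\.js$')


def player_test_id(url):
    """Hypothetical helper: derive an identifier-safe test id from a player URL."""
    m = NSIG_URL_PATTERN.match(url)
    if not m:
        raise ValueError(f'{url!r} should follow URL format')
    # Same substitution as in make_tfunc: '/', '.' and '-' become '_'.
    return re.sub(r'[/.-]', '_', m.group('id'))
```

Because the id now includes the player variant, two variants of the same player hash (for example player_ias and tv-player-ias) yield distinct test names.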

View File

@ -640,6 +640,7 @@ def __init__(self, params=None, auto_init=True):
self._printed_messages = set()
self._first_webpage_request = True
self._post_hooks = []
self._close_hooks = []
self._progress_hooks = []
self._postprocessor_hooks = []
self._download_retcode = 0
@ -654,19 +655,21 @@ def __init__(self, params=None, auto_init=True):
if not all_plugins_loaded.value:
load_all_plugins()
try:
windows_enable_vt_mode()
except Exception as e:
self.write_debug(f'Failed to enable VT mode: {e}')
stdout = sys.stderr if self.params.get('logtostderr') else sys.stdout
self._out_files = Namespace(
out=stdout,
error=sys.stderr,
screen=sys.stderr if self.params.get('quiet') else stdout,
console=next(filter(supports_terminal_sequences, (sys.stderr, sys.stdout)), None),
)
try:
windows_enable_vt_mode()
except Exception as e:
self.write_debug(f'Failed to enable VT mode: {e}')
# hehe "immutable" namespace
self._out_files.console = next(filter(supports_terminal_sequences, (sys.stderr, sys.stdout)), None)
if self.params.get('no_color'):
if self.params.get('color') is not None:
self.params.setdefault('_warnings', []).append(
@ -906,6 +909,11 @@ def add_post_hook(self, ph):
"""Add the post hook"""
self._post_hooks.append(ph)
def add_close_hook(self, ch):
"""Add a close hook, called when YoutubeDL.close() is called"""
assert callable(ch), 'Close hook must be callable'
self._close_hooks.append(ch)
def add_progress_hook(self, ph):
"""Add the download progress hook"""
self._progress_hooks.append(ph)
@ -1014,6 +1022,9 @@ def close(self):
self._request_director.close()
del self._request_director
for close_hook in self._close_hooks:
close_hook()
def trouble(self, message=None, tb=None, is_error=True):
"""Determine action to take when a download problem appears.
@ -4150,7 +4161,7 @@ def _get_available_impersonate_targets(self):
(target, rh.RH_NAME)
for rh in self._request_director.handlers.values()
if isinstance(rh, ImpersonateRequestHandler)
for target in rh.supported_targets
for target in reversed(rh.supported_targets)
]
def _impersonate_target_available(self, target):
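Note (illustrative, not part of this commit): the close-hook additions above follow a simple register-then-call pattern. The sketch below reproduces it on a stand-in class; `Session` is a hypothetical name used only for this example, and the real YoutubeDL.close() does additional cleanup before running the hooks.

```python
class Session:
    """Hypothetical stand-in that mirrors only the close-hook logic from the diff."""

    def __init__(self):
        self._close_hooks = []

    def add_close_hook(self, ch):
        """Add a close hook, called when close() is called"""
        assert callable(ch), 'Close hook must be callable'
        self._close_hooks.append(ch)

    def close(self):
        # Hooks run in registration order.
        for close_hook in self._close_hooks:
            close_hook()


calls = []
session = Session()
session.add_close_hook(lambda: calls.append('first'))
session.add_close_hook(lambda: calls.append('second'))
session.close()
```

After `close()` returns, `calls` holds the hook results in the order the hooks were registered.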

View File

@ -1021,8 +1021,9 @@ def _real_main(argv=None):
# List of simplified targets we know are supported,
# to help users know what dependencies may be required.
(ImpersonateTarget('chrome'), 'curl_cffi'),
(ImpersonateTarget('edge'), 'curl_cffi'),
(ImpersonateTarget('safari'), 'curl_cffi'),
(ImpersonateTarget('firefox'), 'curl_cffi>=0.10'),
(ImpersonateTarget('edge'), 'curl_cffi'),
]
available_targets = ydl._get_available_impersonate_targets()
@ -1038,12 +1039,12 @@ def make_row(target, handler):
for known_target, known_handler in known_targets:
if not any(
known_target in target and handler == known_handler
known_target in target and known_handler.startswith(handler)
for target, handler in available_targets
):
rows.append([
rows.insert(0, [
ydl._format_out(text, ydl.Styles.SUPPRESS)
for text in make_row(known_target, f'{known_handler} (not available)')
for text in make_row(known_target, f'{known_handler} (unavailable)')
])
ydl.to_screen('[info] Available impersonate targets')

View File

@ -83,7 +83,7 @@ def aes_ecb_encrypt(data, key, iv=None):
@returns {int[]} encrypted data
"""
expanded_key = key_expansion(key)
block_count = int(ceil(float(len(data)) / BLOCK_SIZE_BYTES))
block_count = ceil(len(data) / BLOCK_SIZE_BYTES)
encrypted_data = []
for i in range(block_count):
@ -103,7 +103,7 @@ def aes_ecb_decrypt(data, key, iv=None):
@returns {int[]} decrypted data
"""
expanded_key = key_expansion(key)
block_count = int(ceil(float(len(data)) / BLOCK_SIZE_BYTES))
block_count = ceil(len(data) / BLOCK_SIZE_BYTES)
encrypted_data = []
for i in range(block_count):
@ -134,7 +134,7 @@ def aes_ctr_encrypt(data, key, iv):
@returns {int[]} encrypted data
"""
expanded_key = key_expansion(key)
block_count = int(ceil(float(len(data)) / BLOCK_SIZE_BYTES))
block_count = ceil(len(data) / BLOCK_SIZE_BYTES)
counter = iter_vector(iv)
encrypted_data = []
@ -158,7 +158,7 @@ def aes_cbc_decrypt(data, key, iv):
@returns {int[]} decrypted data
"""
expanded_key = key_expansion(key)
block_count = int(ceil(float(len(data)) / BLOCK_SIZE_BYTES))
block_count = ceil(len(data) / BLOCK_SIZE_BYTES)
decrypted_data = []
previous_cipher_block = iv
@ -183,7 +183,7 @@ def aes_cbc_encrypt(data, key, iv, *, padding_mode='pkcs7'):
@returns {int[]} encrypted data
"""
expanded_key = key_expansion(key)
block_count = int(ceil(float(len(data)) / BLOCK_SIZE_BYTES))
block_count = ceil(len(data) / BLOCK_SIZE_BYTES)
encrypted_data = []
previous_cipher_block = iv
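Note (illustrative, not part of this commit): the simplified block_count expression relies on Python 3 semantics, where `/` is true division and `math.ceil` already returns an int, so the removed `int(...)` and `float(...)` wrappers were redundant. A minimal sketch, assuming a 16-byte block size and a hypothetical helper name `block_count`:

```python
from math import ceil

BLOCK_SIZE_BYTES = 16  # assumed AES block size for this example


def block_count(data):
    # True division followed by ceil gives the number of (possibly partial) blocks.
    return ceil(len(data) / BLOCK_SIZE_BYTES)
```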

View File

@ -764,11 +764,11 @@ def _get_linux_desktop_environment(env, logger):
GetDesktopEnvironment
"""
xdg_current_desktop = env.get('XDG_CURRENT_DESKTOP', None)
desktop_session = env.get('DESKTOP_SESSION', None)
desktop_session = env.get('DESKTOP_SESSION', '')
if xdg_current_desktop is not None:
for part in map(str.strip, xdg_current_desktop.split(':')):
if part == 'Unity':
if desktop_session is not None and 'gnome-fallback' in desktop_session:
if 'gnome-fallback' in desktop_session:
return _LinuxDesktopEnvironment.GNOME
else:
return _LinuxDesktopEnvironment.UNITY
@ -797,9 +797,8 @@ def _get_linux_desktop_environment(env, logger):
return _LinuxDesktopEnvironment.UKUI
elif part == 'LXQt':
return _LinuxDesktopEnvironment.LXQT
logger.info(f'XDG_CURRENT_DESKTOP is set to an unknown value: "{xdg_current_desktop}"')
logger.debug(f'XDG_CURRENT_DESKTOP is set to an unknown value: "{xdg_current_desktop}"')
elif desktop_session is not None:
if desktop_session == 'deepin':
return _LinuxDesktopEnvironment.DEEPIN
elif desktop_session in ('mate', 'gnome'):
@ -816,9 +815,8 @@ def _get_linux_desktop_environment(env, logger):
elif desktop_session == 'ukui':
return _LinuxDesktopEnvironment.UKUI
else:
logger.info(f'DESKTOP_SESSION is set to an unknown value: "{desktop_session}"')
logger.debug(f'DESKTOP_SESSION is set to an unknown value: "{desktop_session}"')
else:
if 'GNOME_DESKTOP_SESSION_ID' in env:
return _LinuxDesktopEnvironment.GNOME
elif 'KDE_FULL_SESSION' in env:
@ -826,6 +824,7 @@ def _get_linux_desktop_environment(env, logger):
return _LinuxDesktopEnvironment.KDE4
else:
return _LinuxDesktopEnvironment.KDE3
return _LinuxDesktopEnvironment.OTHER

View File

@ -30,7 +30,7 @@ def get_suitable_downloader(info_dict, params={}, default=NO_DEFAULT, protocol=N
from .http import HttpFD
from .ism import IsmFD
from .mhtml import MhtmlFD
from .niconico import NiconicoDmcFD, NiconicoLiveFD
from .niconico import NiconicoLiveFD
from .rtmp import RtmpFD
from .rtsp import RtspFD
from .websocket import WebSocketFragmentFD
@ -50,7 +50,6 @@ def get_suitable_downloader(info_dict, params={}, default=NO_DEFAULT, protocol=N
'http_dash_segments_generator': DashSegmentsFD,
'ism': IsmFD,
'mhtml': MhtmlFD,
'niconico_dmc': NiconicoDmcFD,
'niconico_live': NiconicoLiveFD,
'fc2_live': FC2LiveFD,
'websocket_frag': WebSocketFragmentFD,
@ -67,7 +66,6 @@ def shorten_protocol_name(proto, simplify=False):
'rtmp_ffmpeg': 'rtmpF',
'http_dash_segments': 'dash',
'http_dash_segments_generator': 'dashG',
'niconico_dmc': 'dmc',
'websocket_frag': 'WSfrag',
}
if simplify:

View File

@ -2,60 +2,12 @@
import threading
import time
from . import get_suitable_downloader
from .common import FileDownloader
from .external import FFmpegFD
from ..networking import Request
from ..utils import DownloadError, str_or_none, try_get
class NiconicoDmcFD(FileDownloader):
""" Downloading niconico douga from DMC with heartbeat """
def real_download(self, filename, info_dict):
from ..extractor.niconico import NiconicoIE
self.to_screen(f'[{self.FD_NAME}] Downloading from DMC')
ie = NiconicoIE(self.ydl)
info_dict, heartbeat_info_dict = ie._get_heartbeat_info(info_dict)
fd = get_suitable_downloader(info_dict, params=self.params)(self.ydl, self.params)
success = download_complete = False
timer = [None]
heartbeat_lock = threading.Lock()
heartbeat_url = heartbeat_info_dict['url']
heartbeat_data = heartbeat_info_dict['data'].encode()
heartbeat_interval = heartbeat_info_dict.get('interval', 30)
request = Request(heartbeat_url, heartbeat_data)
def heartbeat():
try:
self.ydl.urlopen(request).read()
except Exception:
self.to_screen(f'[{self.FD_NAME}] Heartbeat failed')
with heartbeat_lock:
if not download_complete:
timer[0] = threading.Timer(heartbeat_interval, heartbeat)
timer[0].start()
heartbeat_info_dict['ping']()
self.to_screen('[%s] Heartbeat with %d second interval ...' % (self.FD_NAME, heartbeat_interval))
try:
heartbeat()
if type(fd).__name__ == 'HlsFD':
info_dict.update(ie._extract_m3u8_formats(info_dict['url'], info_dict['id'])[0])
success = fd.real_download(filename, info_dict)
finally:
if heartbeat_lock:
with heartbeat_lock:
timer[0].cancel()
download_complete = True
return success
class NiconicoLiveFD(FileDownloader):
""" Downloads niconico live without being stopped """
@ -85,6 +37,7 @@ def communicate_ws(reconnect):
'quality': live_quality,
'protocol': 'hls+fmp4',
'latency': live_latency,
'accessRightMethod': 'single_cookie',
'chasePlay': False,
},
'room': {

View File

@ -300,7 +300,6 @@
BrainPOPIlIE,
BrainPOPJrIE,
)
from .bravotv import BravoTVIE
from .breitbart import BreitBartIE
from .brightcove import (
BrightcoveLegacyIE,
@ -338,7 +337,6 @@
from .canalplus import CanalplusIE
from .canalsurmas import CanalsurmasIE
from .caracoltv import CaracolTvPlayIE
from .cartoonnetwork import CartoonNetworkIE
from .cbc import (
CBCIE,
CBCGemIE,
@ -496,10 +494,6 @@
from .daystar import DaystarClipIE
from .dbtv import DBTVIE
from .dctp import DctpTvIE
from .deezer import (
DeezerAlbumIE,
DeezerPlaylistIE,
)
from .democracynow import DemocracynowIE
from .detik import DetikEmbedIE
from .deuxm import (
@ -687,6 +681,7 @@
)
from .foxsports import FoxSportsIE
from .fptplay import FptplayIE
from .francaisfacile import FrancaisFacileIE
from .franceinter import FranceInterIE
from .francetv import (
FranceTVIE,
@ -843,6 +838,7 @@
from .ichinanalive import (
IchinanaLiveClipIE,
IchinanaLiveIE,
IchinanaLiveVODIE,
)
from .idolplus import IdolPlusIE
from .ign import (
@ -905,6 +901,7 @@
IviIE,
)
from .ivideon import IvideonIE
from .ivoox import IvooxIE
from .iwara import (
IwaraIE,
IwaraPlaylistIE,
@ -930,7 +927,10 @@
)
from .jiosaavn import (
JioSaavnAlbumIE,
JioSaavnArtistIE,
JioSaavnPlaylistIE,
JioSaavnShowIE,
JioSaavnShowPlaylistIE,
JioSaavnSongIE,
)
from .joj import JojIE
@ -962,7 +962,10 @@
)
from .kicker import KickerIE
from .kickstarter import KickStarterIE
from .kika import KikaIE
from .kika import (
KikaIE,
KikaPlaylistIE,
)
from .kinja import KinjaEmbedIE
from .kinopoisk import KinoPoiskIE
from .kommunetv import KommunetvIE
@ -1040,6 +1043,7 @@
LimelightMediaIE,
)
from .linkedin import (
LinkedInEventsIE,
LinkedInIE,
LinkedInLearningCourseIE,
LinkedInLearningIE,
@ -1055,6 +1059,7 @@
)
from .livestreamfails import LivestreamfailsIE
from .lnk import LnkIE
from .loco import LocoIE
from .loom import (
LoomFolderIE,
LoomIE,
@ -1062,6 +1067,7 @@
from .lovehomeporn import LoveHomePornIE
from .lrt import (
LRTVODIE,
LRTRadioIE,
LRTStreamIE,
)
from .lsm import (
@ -1255,6 +1261,7 @@
)
from .nbc import (
NBCIE,
BravoTVIE,
NBCNewsIE,
NBCOlympicsIE,
NBCOlympicsStreamIE,
@ -1262,6 +1269,7 @@
NBCSportsStreamIE,
NBCSportsVPlayerIE,
NBCStationsIE,
SyfyIE,
)
from .ndr import (
NDRIE,
@ -1494,6 +1502,10 @@
)
from .parler import ParlerIE
from .parlview import ParlviewIE
from .parti import (
PartiLivestreamIE,
PartiVideoIE,
)
from .patreon import (
PatreonCampaignIE,
PatreonIE,
@ -1741,6 +1753,7 @@
RoosterTeethSeriesIE,
)
from .rottentomatoes import RottenTomatoesIE
from .roya import RoyaLiveIE
from .rozhlas import (
MujRozhlasIE,
RozhlasIE,
@ -1775,7 +1788,6 @@
from .rtve import (
RTVEALaCartaIE,
RTVEAudioIE,
RTVEInfantilIE,
RTVELiveIE,
RTVETelevisionIE,
)
@ -1956,7 +1968,6 @@
SpreakerShowIE,
)
from .springboardplatform import SpringboardPlatformIE
from .sprout import SproutIE
from .sproutvideo import (
SproutVideoIE,
VidsIoIE,
@ -1989,6 +2000,7 @@
StoryFireSeriesIE,
StoryFireUserIE,
)
from .streaks import StreaksIE
from .streamable import StreamableIE
from .streamcz import StreamCZIE
from .streetvoice import StreetVoiceIE
@ -2012,7 +2024,6 @@
SVTSeriesIE,
)
from .swearnet import SwearnetEpisodeIE
from .syfy import SyfyIE
from .syvdk import SYVDKIE
from .sztvhu import SztvHuIE
from .tagesschau import TagesschauIE
@ -2137,6 +2148,7 @@
from .toggo import ToggoIE
from .tonline import TOnlineIE
from .toongoggles import ToonGogglesIE
from .toutiao import ToutiaoIE
from .toutv import TouTvIE
from .toypics import (
ToypicsIE,
@ -2228,7 +2240,10 @@
TVPlayIE,
)
from .tvplayer import TVPlayerIE
from .tvw import TvwIE
from .tvw import (
TvwIE,
TvwTvChannelsIE,
)
from .tweakers import TweakersIE
from .twentymin import TwentyMinutenIE
from .twentythreevideo import TwentyThreeVideoIE
@ -2352,14 +2367,11 @@
ViewLiftIE,
)
from .viidea import ViideaIE
from .viki import (
VikiChannelIE,
VikiIE,
)
from .vimeo import (
VHXEmbedIE,
VimeoAlbumIE,
VimeoChannelIE,
VimeoEventIE,
VimeoGroupsIE,
VimeoIE,
VimeoLikesIE,
@ -2400,10 +2412,15 @@
VoxMediaIE,
VoxMediaVolumeIE,
)
from .vrsquare import (
VrSquareChannelIE,
VrSquareIE,
VrSquareSearchIE,
VrSquareSectionIE,
)
from .vrt import (
VRTIE,
DagelijkseKostIE,
KetnetIE,
Radio1BeIE,
VrtNUIE,
)

View File

@ -21,6 +21,7 @@
int_or_none,
time_seconds,
traverse_obj,
update_url,
update_url_query,
)
@ -417,6 +418,10 @@ def _real_extract(self, url):
'is_live': is_live,
'availability': availability,
})
if thumbnail := update_url(self._og_search_thumbnail(webpage, default=''), query=None):
info['thumbnails'] = [{'url': thumbnail}]
return info

View File

@ -3,6 +3,7 @@
import re
import time
import urllib.parse
import uuid
import xml.etree.ElementTree as etree
from .common import InfoExtractor
@ -10,6 +11,7 @@
from ..utils import (
NO_DEFAULT,
ExtractorError,
parse_qs,
unescapeHTML,
unified_timestamp,
urlencode_postdata,
@ -45,6 +47,8 @@
'name': 'Comcast XFINITY',
'username_field': 'user',
'password_field': 'passwd',
'login_hostname': 'login.xfinity.com',
'needs_newer_ua': True,
},
'TWC': {
'name': 'Time Warner Cable | Spectrum',
@ -74,6 +78,12 @@
'name': 'Verizon FiOS',
'username_field': 'IDToken1',
'password_field': 'IDToken2',
'login_hostname': 'ssoauth.verizon.com',
},
'Fubo': {
'name': 'Fubo',
'username_field': 'username',
'password_field': 'password',
},
'Cablevision': {
'name': 'Optimum/Cablevision',
@ -1338,6 +1348,7 @@
'name': 'Sling TV',
'username_field': 'username',
'password_field': 'password',
'login_hostname': 'identity.sling.com',
},
'Suddenlink': {
'name': 'Suddenlink',
@ -1355,7 +1366,6 @@
class AdobePassIE(InfoExtractor): # XXX: Conventionally, base classes should end with BaseIE/InfoExtractor
_SERVICE_PROVIDER_TEMPLATE = 'https://sp.auth.adobe.com/adobe-services/%s'
_USER_AGENT = 'Mozilla/5.0 (X11; Linux i686; rv:47.0) Gecko/20100101 Firefox/47.0'
_MODERN_USER_AGENT = 'Mozilla/5.0 (Windows NT 10.0; rv:131.0) Gecko/20100101 Firefox/131.0'
_MVPD_CACHE = 'ap-mvpd'
_DOWNLOADING_LOGIN_PAGE = 'Downloading Provider Login Page'
@ -1367,6 +1377,14 @@ def _download_webpage_handle(self, *args, **kwargs):
return super()._download_webpage_handle(
*args, **kwargs)
@staticmethod
def _get_mso_headers(mso_info):
# yt-dlp's default user-agent is usually too old for some MSO's like Comcast_SSO
# See: https://github.com/yt-dlp/yt-dlp/issues/10848
return {
'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; rv:131.0) Gecko/20100101 Firefox/131.0',
} if mso_info.get('needs_newer_ua') else {}
@staticmethod
def _get_mvpd_resource(provider_id, title, guid, rating):
channel = etree.Element('channel')
@ -1382,7 +1400,13 @@ def _get_mvpd_resource(provider_id, title, guid, rating):
resource_rating.text = rating
return '<rss version="2.0" xmlns:media="http://search.yahoo.com/mrss/">' + etree.tostring(channel).decode() + '</rss>'
def _extract_mvpd_auth(self, url, video_id, requestor_id, resource):
def _extract_mvpd_auth(self, url, video_id, requestor_id, resource, software_statement):
mso_id = self.get_param('ap_mso')
if mso_id:
mso_info = MSO_INFO[mso_id]
else:
mso_info = {}
def xml_text(xml_str, tag):
return self._search_regex(
f'<{tag}>(.+?)</{tag}>', xml_str, tag)
@ -1391,15 +1415,27 @@ def is_expired(token, date_ele):
token_expires = unified_timestamp(re.sub(r'[_ ]GMT', '', xml_text(token, date_ele)))
return token_expires and token_expires <= int(time.time())
def post_form(form_page_res, note, data={}):
def post_form(form_page_res, note, data={}, validate_url=False):
form_page, urlh = form_page_res
post_url = self._html_search_regex(r'<form[^>]+action=(["\'])(?P<url>.+?)\1', form_page, 'post url', group='url')
if not re.match(r'https?://', post_url):
post_url = urllib.parse.urljoin(urlh.url, post_url)
if validate_url:
# This request is submitting credentials so we should validate it when possible
url_parsed = urllib.parse.urlparse(post_url)
expected_hostname = mso_info.get('login_hostname')
if expected_hostname and expected_hostname != url_parsed.hostname:
raise ExtractorError(
f'Unexpected login URL hostname; expected "{expected_hostname}" but got '
f'"{url_parsed.hostname}". Aborting before submitting credentials')
if url_parsed.scheme != 'https':
self.write_debug('Upgrading login URL scheme to https')
post_url = urllib.parse.urlunparse(url_parsed._replace(scheme='https'))
form_data = self._hidden_inputs(form_page)
form_data.update(data)
return self._download_webpage_handle(
post_url, video_id, note, data=urlencode_postdata(form_data), headers={
**self._get_mso_headers(mso_info),
'Content-Type': 'application/x-www-form-urlencoded',
})
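Note (illustrative, not part of this commit): the validate_url branch in post_form checks the form's target hostname against the MSO's expected login_hostname and upgrades the scheme to https before credentials are sent. The sketch below isolates that logic as a standalone function. The function name `validate_login_url` and the use of ValueError instead of ExtractorError are assumptions made to keep the example self-contained.

```python
import urllib.parse


def validate_login_url(post_url, expected_hostname=None):
    """Hypothetical standalone version of the check done inside post_form."""
    url_parsed = urllib.parse.urlparse(post_url)
    if expected_hostname and expected_hostname != url_parsed.hostname:
        # The diff raises ExtractorError here; ValueError keeps this sketch dependency-free.
        raise ValueError(
            f'Unexpected login URL hostname; expected "{expected_hostname}" but got '
            f'"{url_parsed.hostname}". Aborting before submitting credentials')
    if url_parsed.scheme != 'https':
        # Never submit credentials over plain http.
        post_url = urllib.parse.urlunparse(url_parsed._replace(scheme='https'))
    return post_url
```

When no expected hostname is configured for a provider, only the scheme upgrade applies, which matches the "validate it when possible" comment in the hunk.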
@ -1432,19 +1468,58 @@ def extract_redirect_url(html, url=None, fatal=False):
}
guid = xml_text(resource, 'guid') if '<' in resource else resource
count = 0
while count < 2:
for _ in range(2):
requestor_info = self.cache.load(self._MVPD_CACHE, requestor_id) or {}
authn_token = requestor_info.get('authn_token')
if authn_token and is_expired(authn_token, 'simpleTokenExpires'):
authn_token = None
if not authn_token:
mso_id = self.get_param('ap_mso')
if mso_id:
if not mso_id:
raise_mvpd_required()
username, password = self._get_login_info('ap_username', 'ap_password', mso_id)
if not username or not password:
raise_mvpd_required()
mso_info = MSO_INFO[mso_id]
device_info, urlh = self._download_json_handle(
'https://sp.auth.adobe.com/indiv/devices',
video_id, 'Registering device with Adobe',
data=json.dumps({'fingerprint': uuid.uuid4().hex}).encode(),
headers={'Content-Type': 'application/json; charset=UTF-8'})
device_id = device_info['deviceId']
mvpd_headers['pass_sfp'] = urlh.get_header('pass_sfp')
mvpd_headers['Ap_21'] = device_id
registration = self._download_json(
'https://sp.auth.adobe.com/o/client/register',
video_id, 'Registering client with Adobe',
data=json.dumps({'software_statement': software_statement}).encode(),
headers={'Content-Type': 'application/json; charset=UTF-8'})
access_token = self._download_json(
'https://sp.auth.adobe.com/o/client/token', video_id,
'Obtaining access token', data=urlencode_postdata({
'grant_type': 'client_credentials',
'client_id': registration['client_id'],
'client_secret': registration['client_secret'],
}),
headers={
'Content-Type': 'application/x-www-form-urlencoded; charset=UTF-8',
})['access_token']
mvpd_headers['Authorization'] = f'Bearer {access_token}'
reg_code = self._download_json(
f'https://sp.auth.adobe.com/reggie/v1/{requestor_id}/regcode',
video_id, 'Obtaining registration code',
data=urlencode_postdata({
'requestor': requestor_id,
'deviceId': device_id,
'format': 'json',
}),
headers={
'Content-Type': 'application/x-www-form-urlencoded; charset=UTF-8',
'Authorization': f'Bearer {access_token}',
})['code']
provider_redirect_page_res = self._download_webpage_handle(
self._SERVICE_PROVIDER_TEMPLATE % 'authenticate/saml', video_id,
@ -1455,17 +1530,10 @@ def extract_redirect_url(html, url=None, fatal=False):
'no_iframe': 'false',
'domain_name': 'adobe.com',
'redirect_url': url,
}, headers={
# yt-dlp's default user-agent is usually too old for Comcast_SSO
# See: https://github.com/yt-dlp/yt-dlp/issues/10848
'User-Agent': self._MODERN_USER_AGENT,
} if mso_id == 'Comcast_SSO' else None)
elif not self._cookies_passed:
raise_mvpd_required()
'reg_code': reg_code,
}, headers=self._get_mso_headers(mso_info))
if not mso_id:
pass
elif mso_id == 'Comcast_SSO':
if mso_id == 'Comcast_SSO':
# Comcast page flow varies by video site and whether you
# are on Comcast's network.
provider_redirect_page, urlh = provider_redirect_page_res
@ -1489,8 +1557,8 @@ def extract_redirect_url(html, url=None, fatal=False):
oauth_redirect_url = extract_redirect_url(
provider_redirect_page, fatal=True)
provider_login_page_res = self._download_webpage_handle(
oauth_redirect_url, video_id,
self._DOWNLOADING_LOGIN_PAGE)
oauth_redirect_url, video_id, self._DOWNLOADING_LOGIN_PAGE,
headers=self._get_mso_headers(mso_info))
else:
provider_login_page_res = post_form(
provider_redirect_page_res,
@ -1500,7 +1568,7 @@ def extract_redirect_url(html, url=None, fatal=False):
provider_login_page_res, 'Logging in', {
mso_info['username_field']: username,
mso_info['password_field']: password,
})
}, validate_url=True)
mvpd_confirm_page, urlh = mvpd_confirm_page_res
if '<button class="submit" value="Resume">Resume</button>' in mvpd_confirm_page:
post_form(mvpd_confirm_page_res, 'Confirming Login')
@ -1539,7 +1607,7 @@ def extract_redirect_url(html, url=None, fatal=False):
provider_redirect_page_res, 'Logging in', {
mso_info['username_field']: username,
mso_info['password_field']: password,
})
}, validate_url=True)
saml_login_page, urlh = saml_login_page_res
if 'Please try again.' in saml_login_page:
raise ExtractorError(
@ -1560,7 +1628,7 @@ def extract_redirect_url(html, url=None, fatal=False):
[saml_login_page, saml_redirect_url], 'Logging in', {
mso_info['username_field']: username,
mso_info['password_field']: password,
})
}, validate_url=True)
if 'Please try again.' in saml_login_page:
raise ExtractorError(
'Failed to login, incorrect User ID or Password.')
@ -1631,7 +1699,7 @@ def extract_redirect_url(html, url=None, fatal=False):
provider_login_page_res, 'Logging in', {
mso_info['username_field']: username,
mso_info['password_field']: password,
})
}, validate_url=True)
provider_refresh_redirect_url = extract_redirect_url(
provider_association_redirect, url=urlh.url)
@ -1682,7 +1750,7 @@ def extract_redirect_url(html, url=None, fatal=False):
provider_login_page_res, 'Logging in', {
mso_info['username_field']: username,
mso_info['password_field']: password,
})
}, validate_url=True)
provider_refresh_redirect_url = extract_redirect_url(
provider_association_redirect, url=urlh.url)
@ -1699,6 +1767,27 @@ def extract_redirect_url(html, url=None, fatal=False):
query=hidden_data)
post_form(mvpd_confirm_page_res, 'Confirming Login')
elif mso_id == 'Fubo':
_, urlh = provider_redirect_page_res
fubo_response = self._download_json(
'https://api.fubo.tv/partners/tve/connect', video_id,
'Authenticating with Fubo', 'Unable to authenticate with Fubo',
query=parse_qs(urlh.url), data=json.dumps({
'username': username,
'password': password,
}).encode(), headers={
'Accept': 'application/json',
'Content-Type': 'application/json',
})
self._request_webpage(
'https://sp.auth.adobe.com/adobe-services/oauth2', video_id,
'Authenticating with Adobe', 'Failed to authenticate with Adobe',
query={
'code': fubo_response['code'],
'state': fubo_response['state'],
})
else:
# Some providers (e.g. DIRECTV NOW) have another meta refresh
# based redirect that should be followed.
@ -1717,7 +1806,8 @@ def extract_redirect_url(html, url=None, fatal=False):
}
if mso_id in ('Cablevision', 'AlticeOne'):
form_data['_eventId_proceed'] = ''
mvpd_confirm_page_res = post_form(provider_login_page_res, 'Logging in', form_data)
mvpd_confirm_page_res = post_form(
provider_login_page_res, 'Logging in', form_data, validate_url=True)
if mso_id != 'Rogers':
post_form(mvpd_confirm_page_res, 'Confirming Login')
@ -1727,6 +1817,7 @@ def extract_redirect_url(html, url=None, fatal=False):
'Retrieving Session', data=urlencode_postdata({
'_method': 'GET',
'requestor_id': requestor_id,
'reg_code': reg_code,
}), headers=mvpd_headers)
except ExtractorError as e:
if not mso_id and isinstance(e.cause, HTTPError) and e.cause.status == 401:
@ -1734,7 +1825,6 @@ def extract_redirect_url(html, url=None, fatal=False):
raise
if '<pendingLogout' in session:
self.cache.store(self._MVPD_CACHE, requestor_id, {})
count += 1
continue
authn_token = unescapeHTML(xml_text(session, 'authnToken'))
requestor_info['authn_token'] = authn_token
@ -1755,7 +1845,6 @@ def extract_redirect_url(html, url=None, fatal=False):
}), headers=mvpd_headers)
if '<pendingLogout' in authorize:
self.cache.store(self._MVPD_CACHE, requestor_id, {})
count += 1
continue
if '<error' in authorize:
raise ExtractorError(xml_text(authorize, 'details'), expected=True)
@ -1778,6 +1867,5 @@ def extract_redirect_url(html, url=None, fatal=False):
}), headers=mvpd_headers)
if '<pendingLogout' in short_authorize:
self.cache.store(self._MVPD_CACHE, requestor_id, {})
count += 1
continue
return short_authorize


@ -84,6 +84,8 @@ class AdultSwimIE(TurnerBaseIE):
'skip': '404 Not Found',
}]
_SOFTWARE_STATEMENT = 'eyJhbGciOiJSUzI1NiJ9.eyJzdWIiOiIwNjg5ZmU2My00OTc5LTQxZmQtYWYxNC1hYjVlNmJjNWVkZWIiLCJuYmYiOjE1MzcxOTA2NzQsImlzcyI6ImF1dGguYWRvYmUuY29tIiwiaWF0IjoxNTM3MTkwNjc0fQ.Xl3AEduM0s1TxDQ6-XssdKIiLm261hhsEv1C1yo_nitIajZThSI9rXILqtIzO0aujoHhdzUnu_dUCq9ffiSBzEG632tTa1la-5tegHtce80cMhewBN4n2t8n9O5tiaPx8MPY8ALdm5wS7QzWE6DO_LTJKgE8Bl7Yv-CWJT4q4SywtNiQWLVOuhBRnDyfsRezxRwptw8qTn9dv5ZzUrVJaby5fDZ_nOncMKvegOgaKd5KEuCAGQ-mg-PSuValMjGuf6FwDguGaK7IyI5Y2oOrzXmD4Dj7q4WBg8w9QoZhtLeAU56mcsGILolku2R5FHlVLO9xhjResyt-pfmegOkpSw'
def _real_extract(self, url):
show_path, episode_path = self._match_valid_url(url).groups()
display_id = episode_path or show_path
@ -152,7 +154,7 @@ def _real_extract(self, url):
# CDN_TOKEN_APP_ID from:
# https://d2gg02c3xr550i.cloudfront.net/assets/asvp.e9c8bef24322d060ef87.bundle.js
'appId': 'eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJhcHBJZCI6ImFzLXR2ZS1kZXNrdG9wLXB0enQ2bSIsInByb2R1Y3QiOiJ0dmUiLCJuZXR3b3JrIjoiYXMiLCJwbGF0Zm9ybSI6ImRlc2t0b3AiLCJpYXQiOjE1MzI3MDIyNzl9.BzSCk-WYOZ2GMCIaeVb8zWnzhlgnXuJTCu0jGp_VaZE',
}, {
}, self._SOFTWARE_STATEMENT, {
'url': url,
'site_name': 'AdultSwim',
'auth_required': auth,


@ -20,13 +20,13 @@ class AENetworksBaseIE(ThePlatformIE): # XXX: Do not subclass from concrete IE
_THEPLATFORM_KEY = '43jXaGRQud'
_THEPLATFORM_SECRET = 'S10BPXHMlb'
_DOMAIN_MAP = {
'history.com': ('HISTORY', 'history'),
'aetv.com': ('AETV', 'aetv'),
'mylifetime.com': ('LIFETIME', 'lifetime'),
'lifetimemovieclub.com': ('LIFETIMEMOVIECLUB', 'lmc'),
'fyi.tv': ('FYI', 'fyi'),
'historyvault.com': (None, 'historyvault'),
'biography.com': (None, 'biography'),
'history.com': ('HISTORY', 'history', 'eyJhbGciOiJSUzI1NiJ9.eyJzdWIiOiI1MzZlMTQ3ZS0zMzFhLTQxY2YtYTMwNC01MDA2NzNlOGYwYjYiLCJuYmYiOjE1Mzg2NjMzMDksImlzcyI6ImF1dGguYWRvYmUuY29tIiwiaWF0IjoxNTM4NjYzMzA5fQ.n24-FVHLGXJe2D4atIQZ700aiXKIajKh5PWFoHJ40Az4itjtwwSFHnvufnoal3T8lYkwNLxce7H-IEGxIykRkZEdwq09pMKMT-ft9ASzE4vQ8fAWbf5ZgDME86x4Jq_YaxkRc9Ne0eShGhl8fgTJHvk07sfWcol61HJ7kU7K8FzzcHR0ucFQgA5VNd8RyjoGWY7c6VxnXR214LOpXsywmit04-vGJC102b_WA2EQfqI93UzG6M6l0EeV4n0_ijP3s8_i8WMJZ_uwnTafCIY6G_731i01dKXDLSFzG1vYglAwDa8DTcdrAAuIFFDF6QNGItCCmwbhjufjmoeVb7R1Gg'),
'aetv.com': ('AETV', 'aetv', 'eyJhbGciOiJSUzI1NiJ9.eyJzdWIiOiI5Y2IwNjg2Yy03ODUxLTRiZDUtODcyMC00MjNlZTg1YTQ1NzMiLCJuYmYiOjE1Mzg2NjMyOTAsImlzcyI6ImF1dGguYWRvYmUuY29tIiwiaWF0IjoxNTM4NjYzMjkwfQ.T5Elf0X4TndO4NEgqBas1gDxNHGPVk_daO2Ha5FBzVO6xi3zM7eavdAKfYMCN7gpWYJx03iADaVPtczO_t_aGZczDjpwJHgTUzDgvcLZAVsVDqtDIAMy3S846rPgT6UDbVoxurA7B2VTPm9phjrSXhejvd0LBO8MQL4AZ3sy2VmiPJ2noT1ily5PuHCYlkrT1fheO064duR__Cd9DQ5VTMnKjzY3Cx345CEwKDkUk5gwgxhXM-aY0eblehrq8VD81_aRM_O3tvh7nbTydHOnUpV-k_iKVi49gqz7Sf8zb6Zh5z2Uftn3vYCfE5NQuesitoRMnsH17nW7o_D59hkRgg'),
'mylifetime.com': ('LIFETIME', 'lifetime', 'eyJhbGciOiJSUzI1NiJ9.eyJzdWIiOiJmODg0MDM1ZC1mZGRmLTRmYjgtYmRkMC05MzRhZDdiYTAwYTciLCJuYmYiOjE1NDkzOTI2NDQsImlzcyI6ImF1dGguYWRvYmUuY29tIiwiaWF0IjoxNTQ5MzkyNjQ0fQ.vkTIaCpheKdKQd__2-3ec4qkcpbAhyCTvwe5iTl922ItSQfVhpEJG4wseVSNmBTrpBi0hvLedcw6Hj1_UuzBMVuVcCqLprU-pI8recEwL0u7G-eVkylsxe1OTUm1o3V6OykXQ9KlA-QQLL1neUhdhR1n5B1LZ4cmtBmiEpfgf4rFwXD1ScFylIcaWKLBqHoRBNUmxyTmoXXvn_A-GGSj9eCizFzY8W5uBwUcsoiw2Cr1skx7PbB2RSP1I5DsoIJKG-8XV1KS7MWl-fNLjE-hVAsI9znqfEEFcPBiv3LhCP4Nf4OIs7xAselMn0M0c8igRUZhURWX_hdygUAxkbKFtQ'),
'fyi.tv': ('FYI', 'fyi', 'eyJhbGciOiJSUzI1NiJ9.eyJzdWIiOiIxOGZiOWM3Ny1mYmMzLTQxYTktYmE1Yi1lMzM0ZmUzNzU4NjEiLCJuYmYiOjE1ODc1ODAzNzcsImlzcyI6ImF1dGguYWRvYmUuY29tIiwiaWF0IjoxNTg3NTgwMzc3fQ.AYDuipKswmIfLBfOjHRsfc5fMV5NmJUmiJnkpiep4VEw9QiXkygFj4bN06Si5tFc5Mee5TDrGzDpV6iuKbVpLT5kuqXhAn-Wozf5zKPsg_IpdEKO7gsiCq4calt72ct44KTqtKD_hVcoxQU24_HaJsRgXzu3B-6Ff6UrmsXkyvYifYVC9v2DSkdCuA02_IrlllzVT2kRuefUXgL4vQRtTFf77uYa0RKSTG7uVkiQ_AU41eXevKlO2qgtc14Hk5cZ7-ZNrDyMCXYA5ngdIHP7Gs9PWaFXT36PFHI_rC4EfxUABPzjQFxjpP75aX5qn8SH__HbM9q3hoPWgaEaf76qIQ'),
'lifetimemovieclub.com': ('LIFETIMEMOVIECLUB', 'lmc', None),
'historyvault.com': (None, 'historyvault', None),
'biography.com': (None, 'biography', None),
}
def _extract_aen_smil(self, smil_url, video_id, auth=None):
@ -71,7 +71,7 @@ def _extract_aen_smil(self, smil_url, video_id, auth=None):
}
def _extract_aetn_info(self, domain, filter_key, filter_value, url):
requestor_id, brand = self._DOMAIN_MAP[domain]
requestor_id, brand, software_statement = self._DOMAIN_MAP[domain]
result = self._download_json(
f'https://feeds.video.aetnd.com/api/v2/{brand}/videos',
filter_value, query={f'filter[{filter_key}]': filter_value})
@ -95,7 +95,7 @@ def _extract_aetn_info(self, domain, filter_key, filter_value, url):
theplatform_metadata.get('AETN$PPL_pplProgramId') or theplatform_metadata.get('AETN$PPL_pplProgramId_OLD'),
traverse_obj(theplatform_metadata, ('ratings', 0, 'rating')))
auth = self._extract_mvpd_auth(
url, video_id, requestor_id, resource)
url, video_id, requestor_id, resource, software_statement)
info.update(self._extract_aen_smil(media_url, video_id, auth))
info.update({
'title': title,
@ -132,10 +132,11 @@ class AENetworksIE(AENetworksBaseIE):
'tags': 'count:14',
'categories': ['Mountain Men'],
'episode_number': 1,
'episode': 'Episode 1',
'episode': 'Winter Is Coming',
'season': 'Season 1',
'season_number': 1,
'series': 'Mountain Men',
'age_limit': 0,
},
'params': {
# m3u8 download
@ -157,18 +158,18 @@ class AENetworksIE(AENetworksBaseIE):
'thumbnail': r're:^https?://.*\.jpe?g$',
'chapters': 'count:4',
'tags': 'count:23',
'episode': 'Episode 1',
'episode': 'Inlawful Entry',
'episode_number': 1,
'season': 'Season 9',
'season_number': 9,
'series': 'Duck Dynasty',
'age_limit': 0,
},
'params': {
# m3u8 download
'skip_download': True,
},
'add_ie': ['ThePlatform'],
'skip': 'This video is only available for users of participating TV providers.',
}, {
'url': 'http://www.fyi.tv/shows/tiny-house-nation/season-1/episode-8',
'only_matching': True,


@ -1,3 +1,4 @@
import datetime as dt
import functools
from .common import InfoExtractor
@ -10,7 +11,7 @@
filter_dict,
int_or_none,
orderedSet,
unified_timestamp,
parse_iso8601,
url_or_none,
urlencode_postdata,
urljoin,
@ -87,9 +88,9 @@ class AfreecaTVIE(AfreecaTVBaseIE):
'uploader_id': 'rlantnghks',
'uploader': '페이즈으',
'duration': 10840,
'thumbnail': r're:https?://videoimg\.sooplive\.co/.kr/.+',
'thumbnail': r're:https?://videoimg\.(?:sooplive\.co\.kr|afreecatv\.com)/.+',
'upload_date': '20230108',
'timestamp': 1673218805,
'timestamp': 1673186405,
'title': '젠지 페이즈',
},
'params': {
@ -102,7 +103,7 @@ class AfreecaTVIE(AfreecaTVBaseIE):
'id': '20170411_BE689A0E_190960999_1_2_h',
'ext': 'mp4',
'title': '혼자사는여자집',
'thumbnail': r're:https?://(?:video|st)img\.sooplive\.co\.kr/.+',
'thumbnail': r're:https?://(?:video|st)img\.(?:sooplive\.co\.kr|afreecatv\.com)/.+',
'uploader': '♥이슬이',
'uploader_id': 'dasl8121',
'upload_date': '20170411',
@ -119,7 +120,7 @@ class AfreecaTVIE(AfreecaTVBaseIE):
'id': '20180327_27901457_202289533_1',
'ext': 'mp4',
'title': '[생]빨개요♥ (part 1)',
'thumbnail': r're:https?://(?:video|st)img\.sooplive\.co\.kr/.+',
'thumbnail': r're:https?://(?:video|st)img\.(?:sooplive\.co\.kr|afreecatv\.com)/.+',
'uploader': '[SA]서아',
'uploader_id': 'bjdyrksu',
'upload_date': '20180327',
@ -187,7 +188,7 @@ def _real_extract(self, url):
'formats': formats,
**traverse_obj(file_element, {
'duration': ('duration', {int_or_none(scale=1000)}),
'timestamp': ('file_start', {unified_timestamp}),
'timestamp': ('file_start', {parse_iso8601(delimiter=' ', timezone=dt.timedelta(hours=9))}),
}),
})
@ -370,7 +371,7 @@ def _real_extract(self, url):
'title': channel_info.get('TITLE') or station_info.get('station_title'),
'uploader': channel_info.get('BJNICK') or station_info.get('station_name'),
'uploader_id': broadcaster_id,
'timestamp': unified_timestamp(station_info.get('broad_start')),
'timestamp': parse_iso8601(station_info.get('broad_start'), delimiter=' ', timezone=dt.timedelta(hours=9)),
'formats': formats,
'is_live': True,
'http_headers': {'Referer': url},


@ -146,7 +146,7 @@ class TokFMPodcastIE(InfoExtractor):
'url': 'https://audycje.tokfm.pl/podcast/91275,-Systemowy-rasizm-Czy-zamieszki-w-USA-po-morderstwie-w-Minneapolis-doprowadza-do-zmian-w-sluzbach-panstwowych',
'info_dict': {
'id': '91275',
'ext': 'aac',
'ext': 'mp3',
'title': 'md5:a9b15488009065556900169fb8061cce',
'episode': 'md5:a9b15488009065556900169fb8061cce',
'series': 'Analizy',
@ -164,23 +164,20 @@ def _real_extract(self, url):
raise ExtractorError('No such podcast', expected=True)
metadata = metadata[0]
formats = []
for ext in ('aac', 'mp3'):
url_data = self._download_json(
f'https://api.podcast.radioagora.pl/api4/getSongUrl?podcast_id={media_id}&device_id={uuid.uuid4()}&ppre=false&audio={ext}',
media_id, f'Downloading podcast {ext} URL')
# prevents inserting the mp3 (default) multiple times
if 'link_ssl' in url_data and f'.{ext}' in url_data['link_ssl']:
formats.append({
'url': url_data['link_ssl'],
'ext': ext,
'vcodec': 'none',
'acodec': ext,
})
mp3_url = self._download_json(
'https://api.podcast.radioagora.pl/api4/getSongUrl',
media_id, 'Downloading podcast mp3 URL', query={
'podcast_id': media_id,
'device_id': str(uuid.uuid4()),
'ppre': 'false',
'audio': 'mp3',
})['link_ssl']
return {
'id': media_id,
'formats': formats,
'url': mp3_url,
'vcodec': 'none',
'ext': 'mp3',
'title': metadata.get('podcast_name'),
'series': metadata.get('series_name'),
'episode': metadata.get('podcast_name'),


@ -1,32 +1,24 @@
import re
from .theplatform import ThePlatformIE
from ..utils import (
int_or_none,
parse_age_limit,
try_get,
update_url_query,
)
from .brightcove import BrightcoveNewIE
from .common import InfoExtractor
from ..utils.traversal import traverse_obj
class AMCNetworksIE(ThePlatformIE): # XXX: Do not subclass from concrete IE
_VALID_URL = r'https?://(?:www\.)?(?P<site>amc|bbcamerica|ifc|(?:we|sundance)tv)\.com/(?P<id>(?:movies|shows(?:/[^/]+)+)/[^/?#&]+)'
class AMCNetworksIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?(?:amc|bbcamerica|ifc|(?:we|sundance)tv)\.com/(?P<id>(?:movies|shows(?:/[^/?#]+)+)/[^/?#&]+)'
_TESTS = [{
'url': 'https://www.bbcamerica.com/shows/the-graham-norton-show/videos/tina-feys-adorable-airline-themed-family-dinner--51631',
'url': 'https://www.amc.com/shows/dark-winds/videos/dark-winds-a-look-at-season-3--1072027',
'info_dict': {
'id': '4Lq1dzOnZGt0',
'id': '6369261343112',
'ext': 'mp4',
'title': "The Graham Norton Show - Season 28 - Tina Fey's Adorable Airline-Themed Family Dinner",
'description': "It turns out child stewardesses are very generous with the wine! All-new episodes of 'The Graham Norton Show' premiere Fridays at 11/10c on BBC America.",
'upload_date': '20201120',
'timestamp': 1605904350,
'uploader': 'AMCN',
'title': 'Dark Winds: A Look at Season 3',
'uploader_id': '6240731308001',
'duration': 176.427,
'thumbnail': r're:https://[^/]+\.boltdns\.net/.+/image\.jpg',
'tags': [],
'timestamp': 1740414792,
'upload_date': '20250224',
},
'params': {
# m3u8 download
'skip_download': True,
},
'skip': '404 Not Found',
'params': {'skip_download': 'm3u8'},
}, {
'url': 'http://www.bbcamerica.com/shows/the-hunt/full-episodes/season-1/episode-01-the-hardest-challenge',
'only_matching': True,
@ -52,96 +44,18 @@ class AMCNetworksIE(ThePlatformIE): # XXX: Do not subclass from concrete IE
'url': 'https://www.sundancetv.com/shows/riviera/full-episodes/season-1/episode-01-episode-1',
'only_matching': True,
}]
_REQUESTOR_ID_MAP = {
'amc': 'AMC',
'bbcamerica': 'BBCA',
'ifc': 'IFC',
'sundancetv': 'SUNDANCE',
'wetv': 'WETV',
}
def _real_extract(self, url):
site, display_id = self._match_valid_url(url).groups()
requestor_id = self._REQUESTOR_ID_MAP[site]
page_data = self._download_json(
f'https://content-delivery-gw.svc.ds.amcn.com/api/v2/content/amcn/{requestor_id.lower()}/url/{display_id}',
display_id)['data']
properties = page_data.get('properties') or {}
query = {
'mbr': 'true',
'manifest': 'm3u',
}
display_id = self._match_id(url)
webpage = self._download_webpage(url, display_id)
initial_data = self._search_json(
r'window\.initialData\s*=\s*JSON\.parse\(String\.raw`', webpage, 'initial data', display_id)
video_id = traverse_obj(initial_data, ('initialData', 'properties', 'videoId', {str}))
if not video_id: # All locked videos are now DRM-protected
self.report_drm(display_id)
account_id = initial_data['config']['brightcove']['accountId']
player_id = initial_data['config']['brightcove']['playerId']
video_player_count = 0
try:
for v in page_data['children']:
if v.get('type') == 'video-player':
release_pid = v['properties']['currentVideo']['meta']['releasePid']
tp_path = 'M_UwQC/' + release_pid
media_url = 'https://link.theplatform.com/s/' + tp_path
video_player_count += 1
except KeyError:
pass
if video_player_count > 1:
self.report_warning(
f'The JSON data has {video_player_count} video players. Only one will be extracted')
# Fall back to videoPid if releasePid not found.
# TODO: Fall back to videoPid if releasePid manifest uses DRM.
if not video_player_count:
tp_path = 'M_UwQC/media/' + properties['videoPid']
media_url = 'https://link.theplatform.com/s/' + tp_path
theplatform_metadata = self._download_theplatform_metadata(tp_path, display_id)
info = self._parse_theplatform_metadata(theplatform_metadata)
video_id = theplatform_metadata['pid']
title = theplatform_metadata['title']
rating = try_get(
theplatform_metadata, lambda x: x['ratings'][0]['rating'])
video_category = properties.get('videoCategory')
if video_category and video_category.endswith('-Auth'):
resource = self._get_mvpd_resource(
requestor_id, title, video_id, rating)
query['auth'] = self._extract_mvpd_auth(
url, video_id, requestor_id, resource)
media_url = update_url_query(media_url, query)
formats, subtitles = self._extract_theplatform_smil(
media_url, video_id)
thumbnails = []
thumbnail_urls = [properties.get('imageDesktop')]
if 'thumbnail' in info:
thumbnail_urls.append(info.pop('thumbnail'))
for thumbnail_url in thumbnail_urls:
if not thumbnail_url:
continue
mobj = re.search(r'(\d+)x(\d+)', thumbnail_url)
thumbnails.append({
'url': thumbnail_url,
'width': int(mobj.group(1)) if mobj else None,
'height': int(mobj.group(2)) if mobj else None,
})
info.update({
'age_limit': parse_age_limit(rating),
'formats': formats,
'id': video_id,
'subtitles': subtitles,
'thumbnails': thumbnails,
})
ns_keys = theplatform_metadata.get('$xmlns', {}).keys()
if ns_keys:
ns = next(iter(ns_keys))
episode = theplatform_metadata.get(ns + '$episodeTitle') or None
episode_number = int_or_none(
theplatform_metadata.get(ns + '$episode'))
season_number = int_or_none(
theplatform_metadata.get(ns + '$season'))
series = theplatform_metadata.get(ns + '$show') or None
info.update({
'episode': episode,
'episode_number': episode_number,
'season_number': season_number,
'series': series,
})
return info
return self.url_result(
f'https://players.brightcove.net/{account_id}/{player_id}_default/index.html?videoId={video_id}',
BrightcoveNewIE, video_id)


@ -1,64 +1,105 @@
import urllib.parse
from .common import InfoExtractor
from ..networking.exceptions import HTTPError
from ..utils import (
ExtractorError,
int_or_none,
parse_age_limit,
url_or_none,
urlencode_postdata,
)
from ..utils.traversal import traverse_obj
class AtresPlayerIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?atresplayer\.com/[^/]+/[^/]+/[^/]+/[^/]+/(?P<display_id>.+?)_(?P<id>[0-9a-f]{24})'
_VALID_URL = r'https?://(?:www\.)?atresplayer\.com/(?:[^/?#]+/){4}(?P<display_id>.+?)_(?P<id>[0-9a-f]{24})'
_NETRC_MACHINE = 'atresplayer'
_TESTS = [
{
'url': 'https://www.atresplayer.com/antena3/series/pequenas-coincidencias/temporada-1/capitulo-7-asuntos-pendientes_5d4aa2c57ed1a88fc715a615/',
_TESTS = [{
'url': 'https://www.atresplayer.com/lasexta/programas/el-objetivo/clips/mbappe-describe-como-entrenador-a-carlo-ancelotti-sabe-cuando-tiene-que-ser-padre-jefe-amigo-entrenador_67f2dfb2fb6ab0e4c7203849/',
'info_dict': {
'id': '5d4aa2c57ed1a88fc715a615',
'ext': 'mp4',
'title': 'Capítulo 7: Asuntos pendientes',
'description': 'md5:7634cdcb4d50d5381bedf93efb537fbc',
'duration': 3413,
'id': '67f2dfb2fb6ab0e4c7203849',
'display_id': 'md5:c203f8d4e425ed115ba56a1c6e4b3e6c',
'title': 'Mbappé describe como entrenador a Carlo Ancelotti: "Sabe cuándo tiene que ser padre, jefe, amigo, entrenador..."',
'channel': 'laSexta',
'duration': 31,
'thumbnail': 'https://imagenes.atresplayer.com/atp/clipping/cmsimages02/2025/04/06/B02DBE1E-D59B-4683-8404-1A9595D15269/1920x1080.jpg',
'tags': ['Entrevista informativa', 'Actualidad', 'Debate informativo', 'Política', 'Economía', 'Sociedad', 'Cara a cara', 'Análisis', 'Más periodismo'],
'series': 'El Objetivo',
'season': 'Temporada 12',
'timestamp': 1743970079,
'upload_date': '20250406',
},
'skip': 'This video is only available for registered users',
}, {
'url': 'https://www.atresplayer.com/antena3/programas/el-hormiguero/clips/revive-la-entrevista-completa-a-miguel-bose-en-el-hormiguero_67f836baa4a5b0e4147ca59a/',
'info_dict': {
'ext': 'mp4',
'id': '67f836baa4a5b0e4147ca59a',
'display_id': 'revive-la-entrevista-completa-a-miguel-bose-en-el-hormiguero',
'title': 'Revive la entrevista completa a Miguel Bosé en El Hormiguero',
'description': 'md5:c6d2b591408d45a7bc2986dfb938eb72',
'channel': 'Antena 3',
'duration': 2556,
'thumbnail': 'https://imagenes.atresplayer.com/atp/clipping/cmsimages02/2025/04/10/9076395F-F1FD-48BE-9F18-540DBA10EBAD/1920x1080.jpg',
'tags': ['Entrevista', 'Variedades', 'Humor', 'Entretenimiento', 'Te sigo', 'Buen rollo', 'Cara a cara'],
'series': 'El Hormiguero ',
'season': 'Temporada 14',
'timestamp': 1744320111,
'upload_date': '20250410',
},
{
}, {
'url': 'https://www.atresplayer.com/flooxer/series/biara-proyecto-lazarus/temporada-1/capitulo-3-supervivientes_67a6038b64ceca00070f4f69/',
'info_dict': {
'ext': 'mp4',
'id': '67a6038b64ceca00070f4f69',
'display_id': 'capitulo-3-supervivientes',
'title': 'Capítulo 3: Supervivientes',
'description': 'md5:65b231f20302f776c2b0dd24594599a1',
'channel': 'Flooxer',
'duration': 1196,
'thumbnail': 'https://imagenes.atresplayer.com/atp/clipping/cmsimages01/2025/02/14/17CF90D3-FE67-40C5-A941-7825B3E13992/1920x1080.jpg',
'tags': ['Juvenil', 'Terror', 'Piel de gallina', 'Te sigo', 'Un break', 'Del tirón'],
'series': 'BIARA: Proyecto Lázarus',
'season': 'Temporada 1',
'season_number': 1,
'episode': 'Episode 3',
'episode_number': 3,
'timestamp': 1743095191,
'upload_date': '20250327',
},
}, {
'url': 'https://www.atresplayer.com/lasexta/programas/el-club-de-la-comedia/temporada-4/capitulo-10-especial-solidario-nochebuena_5ad08edf986b2855ed47adc4/',
'only_matching': True,
},
{
}, {
'url': 'https://www.atresplayer.com/antena3/series/el-secreto-de-puente-viejo/el-chico-de-los-tres-lunares/capitulo-977-29-12-14_5ad51046986b2886722ccdea/',
'only_matching': True,
},
]
}]
_API_BASE = 'https://api.atresplayer.com/'
def _perform_login(self, username, password):
self._request_webpage(
self._API_BASE + 'login', None, 'Downloading login page')
try:
target_url = self._download_json(
'https://account.atresmedia.com/api/login', None,
'Logging in', headers={
'Content-Type': 'application/x-www-form-urlencoded',
}, data=urlencode_postdata({
self._download_webpage(
'https://account.atresplayer.com/auth/v1/login', None,
'Logging in', 'Failed to log in', data=urlencode_postdata({
'username': username,
'password': password,
}))['targetUrl']
}))
except ExtractorError as e:
if isinstance(e.cause, HTTPError) and e.cause.status == 400:
raise ExtractorError('Invalid username and/or password', expected=True)
raise
self._request_webpage(target_url, None, 'Following Target URL')
def _real_extract(self, url):
display_id, video_id = self._match_valid_url(url).groups()
metadata_url = self._download_json(
self._API_BASE + 'client/v1/url', video_id, 'Downloading API endpoint data',
query={'href': urllib.parse.urlparse(url).path})['href']
metadata = self._download_json(metadata_url, video_id)
try:
episode = self._download_json(
self._API_BASE + 'client/v1/player/episode/' + video_id, video_id)
video_data = self._download_json(metadata['urlVideo'], video_id, 'Downloading video data')
except ExtractorError as e:
if isinstance(e.cause, HTTPError) and e.cause.status == 403:
error = self._parse_json(e.cause.response.read(), None)
@ -67,37 +108,45 @@ def _real_extract(self, url):
raise ExtractorError(error['error_description'], expected=True)
raise
title = episode['titulo']
formats = []
subtitles = {}
for source in episode.get('sources', []):
src = source.get('src')
if not src:
continue
for source in traverse_obj(video_data, ('sources', lambda _, v: url_or_none(v['src']))):
src_url = source['src']
src_type = source.get('type')
if src_type == 'application/vnd.apple.mpegurl':
formats, subtitles = self._extract_m3u8_formats(
src, video_id, 'mp4', 'm3u8_native',
m3u8_id='hls', fatal=False)
elif src_type == 'application/dash+xml':
formats, subtitles = self._extract_mpd_formats(
src, video_id, mpd_id='dash', fatal=False)
heartbeat = episode.get('heartbeat') or {}
omniture = episode.get('omniture') or {}
get_meta = lambda x: heartbeat.get(x) or omniture.get(x)
if src_type in ('application/vnd.apple.mpegurl', 'application/hls+legacy', 'application/hls+hevc'):
fmts, subs = self._extract_m3u8_formats_and_subtitles(
src_url, video_id, 'mp4', m3u8_id='hls', fatal=False)
elif src_type in ('application/dash+xml', 'application/dash+hevc'):
fmts, subs = self._extract_mpd_formats_and_subtitles(
src_url, video_id, mpd_id='dash', fatal=False)
else:
continue
formats.extend(fmts)
self._merge_subtitles(subs, target=subtitles)
return {
'display_id': display_id,
'id': video_id,
'title': title,
'description': episode.get('descripcion'),
'thumbnail': episode.get('imgPoster'),
'duration': int_or_none(episode.get('duration')),
'formats': formats,
'channel': get_meta('channel'),
'season': get_meta('season'),
'episode_number': int_or_none(get_meta('episodeNumber')),
'subtitles': subtitles,
**traverse_obj(video_data, {
'title': ('titulo', {str}),
'description': ('descripcion', {str}),
'duration': ('duration', {int_or_none}),
'thumbnail': ('imgPoster', {url_or_none}, {lambda v: f'{v}1920x1080.jpg'}),
'age_limit': ('ageRating', {parse_age_limit}),
}),
**traverse_obj(metadata, {
'title': ('title', {str}),
'description': ('description', {str}),
'duration': ('duration', {int_or_none}),
'tags': ('tags', ..., 'title', {str}),
'age_limit': ('ageRating', {parse_age_limit}),
'series': ('format', 'title', {str}),
'season': ('currentSeason', 'title', {str}),
'season_number': ('currentSeason', 'seasonNumber', {int_or_none}),
'episode_number': ('numberOfEpisode', {int_or_none}),
'timestamp': ('publicationDate', {int_or_none(scale=1000)}),
'channel': ('channel', 'title', {str}),
}),
}


@ -86,7 +86,7 @@ def _parse_video(self, video_data, url=None):
'webpage_url': (
'id', ({value(url)}, {format_field(template='https://www.bandlab.com/post/%s')}), filter, any),
'url': ('video', 'url', {url_or_none}),
'title': ('caption', {lambda x: x.replace('\n', ' ')}, {truncate_string(left=50)}),
'title': ('caption', {lambda x: x.replace('\n', ' ')}, {truncate_string(left=72)}),
'description': ('caption', {str}),
'thumbnail': ('video', 'picture', 'url', {url_or_none}),
'view_count': ('video', 'counters', 'plays', {int_or_none}),
@ -120,7 +120,7 @@ class BandlabIE(BandlabBaseIE):
'duration': 54.629999999999995,
'title': 'sweet black',
'upload_date': '20231210',
'thumbnail': 'https://bandlabimages.azureedge.net/v1.0/songs/fa082beb-b856-4730-9170-a57e4e32cc2c/',
'thumbnail': 'https://bl-prod-images.azureedge.net/v1.0/songs/fa082beb-b856-4730-9170-a57e4e32cc2c/',
'genres': ['Lofi'],
'uploader': 'ender milze',
'comment_count': int,
@ -142,7 +142,7 @@ class BandlabIE(BandlabBaseIE):
'duration': 54.629999999999995,
'title': 'sweet black',
'upload_date': '20231210',
'thumbnail': 'https://bandlabimages.azureedge.net/v1.0/songs/fa082beb-b856-4730-9170-a57e4e32cc2c/',
'thumbnail': 'https://bl-prod-images.azureedge.net/v1.0/songs/fa082beb-b856-4730-9170-a57e4e32cc2c/',
'genres': ['Lofi'],
'uploader': 'ender milze',
'comment_count': int,
@ -158,7 +158,7 @@ class BandlabIE(BandlabBaseIE):
'comment_count': int,
'genres': ['Other'],
'uploader_id': 'user8353034818103753',
'thumbnail': 'https://bandlabimages.azureedge.net/v1.0/songs/51b18363-da23-4b9b-a29c-2933a3e561ca/',
'thumbnail': 'https://bl-prod-images.azureedge.net/v1.0/songs/51b18363-da23-4b9b-a29c-2933a3e561ca/',
'timestamp': 1709625771,
'track': 'PodcastMaerchen4b',
'duration': 468.14,
@ -178,7 +178,7 @@ class BandlabIE(BandlabBaseIE):
'id': '110343fc-148b-ea11-96d2-0003ffd1fc09',
'ext': 'm4a',
'timestamp': 1588273294,
'thumbnail': 'https://bandlabimages.azureedge.net/v1.0/users/b612e533-e4f7-4542-9f50-3fcfd8dd822c/',
'thumbnail': 'https://bl-prod-images.azureedge.net/v1.0/users/b612e533-e4f7-4542-9f50-3fcfd8dd822c/',
'description': 'Final Revision.',
'title': 'Replay ( Instrumental)',
'uploader': 'David R Sparks',
@ -200,7 +200,7 @@ class BandlabIE(BandlabBaseIE):
'id': '5cdf9036-3857-ef11-991a-6045bd36e0d9',
'ext': 'mp4',
'duration': 44.705,
'thumbnail': 'https://bandlabimages.azureedge.net/v1.0/videos/67c6cef1-cef6-40d3-831e-a55bc1dcb972/',
'thumbnail': 'https://bl-prod-images.azureedge.net/v1.0/videos/67c6cef1-cef6-40d3-831e-a55bc1dcb972/',
'comment_count': int,
'title': 'backing vocals',
'uploader_id': 'marliashya',
@ -224,7 +224,7 @@ class BandlabIE(BandlabBaseIE):
'view_count': int,
'track': 'Positronic Meltdown',
'duration': 318.55,
'thumbnail': 'https://bandlabimages.azureedge.net/v1.0/songs/87165bc3-5439-496e-b1f7-a9f13b541ff2/',
'thumbnail': 'https://bl-prod-images.azureedge.net/v1.0/songs/87165bc3-5439-496e-b1f7-a9f13b541ff2/',
'description': 'Checkout my tracks at AOMX http://aomxsounds.com/',
'uploader_id': 'microfreaks',
'title': 'Positronic Meltdown',
@ -246,7 +246,7 @@ class BandlabIE(BandlabBaseIE):
'comment_count': int,
'uploader': 'Sorakime',
'uploader_id': 'sorakime',
'thumbnail': 'https://bandlabimages.azureedge.net/v1.0/users/572a351a-0f3a-4c6a-ac39-1a5defdeeb1c/',
'thumbnail': 'https://bl-prod-images.azureedge.net/v1.0/users/572a351a-0f3a-4c6a-ac39-1a5defdeeb1c/',
'timestamp': 1691162128,
'upload_date': '20230804',
'media_type': 'track',


@ -1596,16 +1596,16 @@ def _real_extract(self, url):
webpage = self._download_webpage(url, list_id)
initial_state = self._search_json(r'window\.__INITIAL_STATE__\s*=', webpage, 'initial state', list_id)
if traverse_obj(initial_state, ('error', 'code', {int_or_none})) != 200:
error_code = traverse_obj(initial_state, ('error', 'trueCode', {int_or_none}))
error_message = traverse_obj(initial_state, ('error', 'message', {str_or_none}))
error = traverse_obj(initial_state, (('error', 'listError'), all, lambda _, v: v['code'], any))
if error and error['code'] != 200:
error_code = error.get('trueCode')
if error_code == -400 and list_id == 'watchlater':
self.raise_login_required('You need to login to access your watchlater playlist')
elif error_code == -403:
self.raise_login_required('This is a private playlist. You need to login as its owner')
elif error_code == 11010:
raise ExtractorError('Playlist is no longer available', expected=True)
raise ExtractorError(f'Could not access playlist: {error_code} {error_message}')
raise ExtractorError(f'Could not access playlist: {error_code} {error.get("message")}')
query = {
'ps': 20,

View File

@ -1,30 +1,32 @@
import functools
import json
import re
from .common import InfoExtractor
from ..networking import HEADRequest
from ..networking.exceptions import HTTPError
from ..utils import (
ExtractorError,
OnDemandPagedList,
clean_html,
extract_attributes,
determine_ext,
format_field,
get_element_by_class,
get_element_by_id,
get_element_html_by_class,
get_elements_html_by_class,
int_or_none,
orderedSet,
parse_count,
parse_duration,
traverse_obj,
unified_strdate,
parse_iso8601,
url_or_none,
urlencode_postdata,
urljoin,
)
from ..utils.traversal import traverse_obj
class BitChuteIE(InfoExtractor):
_VALID_URL = r'https?://(?:(?:www|old)\.)?bitchute\.com/(?:video|embed|torrent/[^/]+)/(?P<id>[^/?#&]+)'
_VALID_URL = r'https?://(?:(?:www|old)\.)?bitchute\.com/(?:video|embed|torrent/[^/?#]+)/(?P<id>[^/?#&]+)'
_EMBED_REGEX = [rf'<(?:script|iframe)[^>]+\bsrc=(["\'])(?P<url>{_VALID_URL})']
_TESTS = [{
'url': 'https://www.bitchute.com/video/UGlrF9o9b-Q/',
@ -34,12 +36,17 @@ class BitChuteIE(InfoExtractor):
'ext': 'mp4',
'title': 'This is the first video on #BitChute !',
'description': 'md5:a0337e7b1fe39e32336974af8173a034',
'thumbnail': r're:^https?://.*\.jpg$',
'thumbnail': r're:https?://.+/.+\.jpg$',
'uploader': 'BitChute',
'upload_date': '20170103',
'uploader_url': 'https://www.bitchute.com/profile/I5NgtHZn9vPj/',
'channel': 'BitChute',
'channel_url': 'https://www.bitchute.com/channel/bitchute/',
'uploader_id': 'I5NgtHZn9vPj',
'channel_id': '1VBwRfyNcKdX',
'view_count': int,
'duration': 16.0,
'timestamp': 1483425443,
},
}, {
# test case: video with different channel and uploader
@ -49,13 +56,18 @@ class BitChuteIE(InfoExtractor):
'id': 'Yti_j9A-UZ4',
'ext': 'mp4',
'title': 'Israel at War | Full Measure',
'description': 'md5:38cf7bc6f42da1a877835539111c69ef',
'thumbnail': r're:^https?://.*\.jpg$',
'description': 'md5:e60198b89971966d6030d22b3268f08f',
'thumbnail': r're:https?://.+/.+\.jpg$',
'uploader': 'sharylattkisson',
'upload_date': '20231106',
'uploader_url': 'https://www.bitchute.com/profile/9K0kUWA9zmd9/',
'channel': 'Full Measure with Sharyl Attkisson',
'channel_url': 'https://www.bitchute.com/channel/sharylattkisson/',
'uploader_id': '9K0kUWA9zmd9',
'channel_id': 'NpdxoCRv3ZLb',
'view_count': int,
'duration': 554.0,
'timestamp': 1699296106,
},
}, {
# video not downloadable in browser, but we can recover it
@ -66,25 +78,21 @@ class BitChuteIE(InfoExtractor):
'ext': 'mp4',
'filesize': 71537926,
'title': 'STYXHEXENHAMMER666 - Election Fraud, Clinton 2020, EU Armies, and Gun Control',
'description': 'md5:228ee93bd840a24938f536aeac9cf749',
'thumbnail': r're:^https?://.*\.jpg$',
'description': 'md5:2029c7c212ccd4b040f52bb2d036ef4e',
'thumbnail': r're:https?://.+/.+\.jpg$',
'uploader': 'BitChute',
'upload_date': '20181113',
'uploader_url': 'https://www.bitchute.com/profile/I5NgtHZn9vPj/',
'channel': 'BitChute',
'channel_url': 'https://www.bitchute.com/channel/bitchute/',
'uploader_id': 'I5NgtHZn9vPj',
'channel_id': '1VBwRfyNcKdX',
'view_count': int,
'duration': 1701.0,
'tags': ['bitchute'],
'timestamp': 1542130287,
},
'params': {'check_formats': None},
}, {
# restricted video
'url': 'https://www.bitchute.com/video/WEnQU7XGcTdl/',
'info_dict': {
'id': 'WEnQU7XGcTdl',
'ext': 'mp4',
'title': 'Impartial Truth - Ein Letzter Appell an die Vernunft',
},
'params': {'skip_download': True},
'skip': 'Georestricted in DE',
}, {
'url': 'https://www.bitchute.com/embed/lbb5G1hjPhw/',
'only_matching': True,
@ -96,11 +104,8 @@ class BitChuteIE(InfoExtractor):
'only_matching': True,
}]
_GEO_BYPASS = False
_HEADERS = {
'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/69.0.3497.57 Safari/537.36',
'Referer': 'https://www.bitchute.com/',
}
_UPLOADER_URL_TMPL = 'https://www.bitchute.com/profile/%s/'
_CHANNEL_URL_TMPL = 'https://www.bitchute.com/channel/%s/'
def _check_format(self, video_url, video_id):
urls = orderedSet(
@ -112,7 +117,7 @@ def _check_format(self, video_url, video_id):
for url in urls:
try:
response = self._request_webpage(
HEADRequest(url), video_id=video_id, note=f'Checking {url}', headers=self._HEADERS)
HEADRequest(url), video_id=video_id, note=f'Checking {url}')
except ExtractorError as e:
self.to_screen(f'{video_id}: URL is invalid, skipping: {e.cause}')
continue
@ -121,54 +126,79 @@ def _check_format(self, video_url, video_id):
'filesize': int_or_none(response.headers.get('Content-Length')),
}
def _raise_if_restricted(self, webpage):
page_title = clean_html(get_element_by_class('page-title', webpage)) or ''
if re.fullmatch(r'(?:Channel|Video) Restricted', page_title):
reason = clean_html(get_element_by_id('page-detail', webpage)) or page_title
self.raise_geo_restricted(reason)
@staticmethod
def _make_url(html):
path = extract_attributes(get_element_html_by_class('spa', html) or '').get('href')
return urljoin('https://www.bitchute.com', path)
def _call_api(self, endpoint, data, display_id, fatal=True):
note = endpoint.rpartition('/')[2]
try:
return self._download_json(
f'https://api.bitchute.com/api/beta/{endpoint}', display_id,
f'Downloading {note} API JSON', f'Unable to download {note} API JSON',
data=json.dumps(data).encode(),
headers={
'Accept': 'application/json',
'Content-Type': 'application/json',
})
except ExtractorError as e:
if isinstance(e.cause, HTTPError) and e.cause.status == 403:
errors = '. '.join(traverse_obj(e.cause.response.read().decode(), (
{json.loads}, 'errors', lambda _, v: v['context'] == 'reason', 'message', {str})))
if errors and 'location' in errors:
# Can always be fatal since the video/media call will reach this code first
self.raise_geo_restricted(errors)
if fatal:
raise
self.report_warning(e.msg)
def _real_extract(self, url):
video_id = self._match_id(url)
webpage = self._download_webpage(
f'https://old.bitchute.com/video/{video_id}', video_id, headers=self._HEADERS)
self._raise_if_restricted(webpage)
publish_date = clean_html(get_element_by_class('video-publish-date', webpage))
entries = self._parse_html5_media_entries(url, webpage, video_id)
data = {'video_id': video_id}
media_url = self._call_api('video/media', data, video_id)['media_url']
formats = []
for format_ in traverse_obj(entries, (0, 'formats', ...)):
if determine_ext(media_url) == 'm3u8':
formats.extend(
self._extract_m3u8_formats(media_url, video_id, 'mp4', m3u8_id='hls', live=True))
else:
if self.get_param('check_formats') is not False:
format_.update(self._check_format(format_.pop('url'), video_id) or {})
if 'url' not in format_:
continue
formats.append(format_)
if fmt := self._check_format(media_url, video_id):
formats.append(fmt)
else:
formats.append({'url': media_url})
if not formats:
self.raise_no_formats(
'Video is unavailable. Please make sure this video is playable in the browser '
'before reporting this issue.', expected=True, video_id=video_id)
details = get_element_by_class('details', webpage) or ''
uploader_html = get_element_html_by_class('creator', details) or ''
channel_html = get_element_html_by_class('name', details) or ''
video = self._call_api('video', data, video_id, fatal=False)
channel = None
if channel_id := traverse_obj(video, ('channel', 'channel_id', {str})):
channel = self._call_api('channel', {'channel_id': channel_id}, video_id, fatal=False)
return {
**traverse_obj(video, {
'title': ('video_name', {str}),
'description': ('description', {str}),
'thumbnail': ('thumbnail_url', {url_or_none}),
'channel': ('channel', 'channel_name', {str}),
'channel_id': ('channel', 'channel_id', {str}),
'channel_url': ('channel', 'channel_url', {urljoin('https://www.bitchute.com/')}),
'uploader_id': ('profile_id', {str}),
'uploader_url': ('profile_id', {format_field(template=self._UPLOADER_URL_TMPL)}, filter),
'timestamp': ('date_published', {parse_iso8601}),
'duration': ('duration', {parse_duration}),
'tags': ('hashtags', ..., {str}, filter, all, filter),
'view_count': ('view_count', {int_or_none}),
'is_live': ('state_id', {lambda x: x == 'live'}),
}),
**traverse_obj(channel, {
'channel': ('channel_name', {str}),
'channel_id': ('channel_id', {str}),
'channel_url': ('url_slug', {format_field(template=self._CHANNEL_URL_TMPL)}, filter),
'uploader': ('profile_name', {str}),
'uploader_id': ('profile_id', {str}),
'uploader_url': ('profile_id', {format_field(template=self._UPLOADER_URL_TMPL)}, filter),
}),
'id': video_id,
'title': self._html_extract_title(webpage) or self._og_search_title(webpage),
'description': self._og_search_description(webpage, default=None),
'thumbnail': self._og_search_thumbnail(webpage),
'uploader': clean_html(uploader_html),
'uploader_url': self._make_url(uploader_html),
'channel': clean_html(channel_html),
'channel_url': self._make_url(channel_html),
'upload_date': unified_strdate(self._search_regex(
r'at \d+:\d+ UTC on (.+?)\.', publish_date, 'upload date', fatal=False)),
'formats': formats,
}
@ -190,7 +220,7 @@ class BitChuteChannelIE(InfoExtractor):
'ext': 'mp4',
'title': 'This is the first video on #BitChute !',
'description': 'md5:a0337e7b1fe39e32336974af8173a034',
'thumbnail': r're:^https?://.*\.jpg$',
'thumbnail': r're:https?://.+/.+\.jpg$',
'uploader': 'BitChute',
'upload_date': '20170103',
'uploader_url': 'https://www.bitchute.com/profile/I5NgtHZn9vPj/',
@ -198,6 +228,9 @@ class BitChuteChannelIE(InfoExtractor):
'channel_url': 'https://www.bitchute.com/channel/bitchute/',
'duration': 16,
'view_count': int,
'uploader_id': 'I5NgtHZn9vPj',
'channel_id': '1VBwRfyNcKdX',
'timestamp': 1483425443,
},
},
],
@ -213,6 +246,7 @@ class BitChuteChannelIE(InfoExtractor):
'title': 'Bruce MacDonald and "The Light of Darkness"',
'description': 'md5:747724ef404eebdfc04277714f81863e',
},
'skip': '404 Not Found',
}, {
'url': 'https://old.bitchute.com/playlist/wV9Imujxasw9/',
'only_matching': True,
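Note on the `_call_api` hunk above: on an HTTP 403 the new code reads the response body, collects the `message` of every error entry whose `context` is `'reason'`, and treats the result as a geo-restriction when it mentions a location. A minimal standalone sketch of that check follows, using only the standard library. The function name and the sample body are hypothetical; the field names are taken from the traversal path in the diff, not from any API documentation.

```python
import json


def geo_restriction_reason(body):
    """Return the joined 'reason' messages if they mention a location, else None."""
    try:
        payload = json.loads(body)
    except ValueError:
        return None
    errors = payload.get('errors') if isinstance(payload, dict) else None
    if not isinstance(errors, list):
        return None
    # Keep only string messages from entries tagged with context == 'reason'
    messages = [
        error['message'] for error in errors
        if isinstance(error, dict)
        and error.get('context') == 'reason'
        and isinstance(error.get('message'), str)
    ]
    reason = '. '.join(messages)
    return reason if 'location' in reason else None


# Hypothetical 403 body, shaped after the keys the extractor traverses
sample = json.dumps({'errors': [
    {'context': 'reason', 'message': 'This video is unavailable in your location'},
    {'context': 'other', 'message': 'ignored'},
]})
print(geo_restriction_reason(sample))
```

In the extractor itself the same filtering is done with `traverse_obj`, which also swallows missing keys, so the explicit type checks here only stand in for that behaviour.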

View File

@ -53,7 +53,7 @@ class BlueskyIE(InfoExtractor):
'channel_id': 'did:plc:z72i7hdynmk6r22z27h6tvur',
'channel_url': 'https://bsky.app/profile/did:plc:z72i7hdynmk6r22z27h6tvur',
'thumbnail': r're:https://video.bsky.app/watch/.*\.jpg$',
'title': 'Bluesky now has video! Update your app to versi...',
'title': 'Bluesky now has video! Update your app to version 1.91 or refresh on ...',
'alt_title': 'Bluesky video feature announcement',
'description': r're:(?s)Bluesky now has video! .{239}',
'upload_date': '20240911',
@ -172,7 +172,7 @@ class BlueskyIE(InfoExtractor):
'channel_id': 'did:plc:z72i7hdynmk6r22z27h6tvur',
'channel_url': 'https://bsky.app/profile/did:plc:z72i7hdynmk6r22z27h6tvur',
'thumbnail': r're:https://video.bsky.app/watch/.*\.jpg$',
'title': 'Bluesky now has video! Update your app to versi...',
'title': 'Bluesky now has video! Update your app to version 1.91 or refresh on ...',
'alt_title': 'Bluesky video feature announcement',
'description': r're:(?s)Bluesky now has video! .{239}',
'upload_date': '20240911',
@ -191,7 +191,7 @@ class BlueskyIE(InfoExtractor):
'info_dict': {
'id': '3l7rdfxhyds2f',
'ext': 'mp4',
'uploader': 'cinnamon',
'uploader': 'cinnamon 🐇 🏳️‍⚧️',
'uploader_id': 'cinny.bun.how',
'uploader_url': 'https://bsky.app/profile/cinny.bun.how',
'channel_id': 'did:plc:7x6rtuenkuvxq3zsvffp2ide',
@ -255,7 +255,7 @@ class BlueskyIE(InfoExtractor):
'info_dict': {
'id': '3l77u64l7le2e',
'ext': 'mp4',
'title': 'hearing people on twitter say that bluesky isn\'...',
'title': "hearing people on twitter say that bluesky isn't funny yet so post t...",
'like_count': int,
'uploader_id': 'thafnine.net',
'uploader_url': 'https://bsky.app/profile/thafnine.net',
@ -387,7 +387,7 @@ def _extract_videos(self, root, video_id, embed_path='embed', record_path='recor
'age_limit': (
'labels', ..., 'val', {lambda x: 18 if x in ('sexual', 'porn', 'graphic-media') else None}, any),
'description': (*record_path, 'text', {str}, filter),
'title': (*record_path, 'text', {lambda x: x.replace('\n', ' ')}, {truncate_string(left=50)}),
'title': (*record_path, 'text', {lambda x: x.replace('\n', ' ')}, {truncate_string(left=72)}),
}),
})
return entries
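Note on the title change above: the hunk raises the truncation limit from 50 to 72 characters, which is why the expected test titles grow. The sketch below imitates that behaviour with a hypothetical stand-in, `truncate_title`; it is not yt-dlp's `truncate_string`, and it assumes the helper keeps `limit - 3` characters and appends an ellipsis.

```python
def truncate_title(text, limit=72):
    """Flatten newlines, then cut to at most `limit` characters with a trailing '...'."""
    text = text.replace('\n', ' ')
    if len(text) <= limit:
        return text
    # Reserve three characters for the ellipsis so the result is exactly `limit` long
    return text[:limit - 3] + '...'


long_post = 'word ' * 40
print(len(truncate_title(long_post)))      # limited to 72 characters
print(len(truncate_title(long_post, 50)))  # the previous limit
```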

View File

@ -24,7 +24,7 @@ def _extract_bokecc_formats(self, webpage, video_id, format_id=None):
class BokeCCIE(BokeCCBaseIE):
_IE_DESC = 'CC视频'
IE_DESC = 'CC视频'
_VALID_URL = r'https?://union\.bokecc\.com/playvideo\.bo\?(?P<query>.*)'
_TESTS = [{

View File

@ -7,6 +7,7 @@
join_nonempty,
js_to_json,
mimetype2ext,
parse_resolution,
unified_strdate,
url_or_none,
urljoin,
@ -110,24 +111,23 @@ def _parse_vue_attributes(self, name, string, video_id):
return attributes
@staticmethod
def _process_source(source):
def _process_source(self, source):
url = url_or_none(source['src'])
if not url:
return None
source_type = source.get('type', '')
extension = mimetype2ext(source_type)
is_video = source_type.startswith('video')
note = url.rpartition('.')[0].rpartition('_')[2] if is_video else None
note = self._search_regex(r'[_-]([a-z]+)\.[\da-z]+(?:$|\?)', url, 'note', default=None)
return {
'url': url,
'ext': extension,
'vcodec': None if is_video else 'none',
'vcodec': None if source_type.startswith('video') else 'none',
'quality': 10 if note == 'high' else 0,
'format_note': note,
'format_id': join_nonempty(extension, note),
**parse_resolution(source.get('label')),
}
def _real_extract(self, url):
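Note on the `_process_source` hunk above: the format note used to be derived with two `rpartition` calls and is now taken from a regex that looks for a lowercase word between `_` or `-` and the file extension. The sketch below compares the two approaches outside the extractor. The URLs are hypothetical, and `re.search` stands in for `self._search_regex`.

```python
import re

NOTE_RE = re.compile(r'[_-]([a-z]+)\.[\da-z]+(?:$|\?)')


def note_from_url(url):
    """New approach: the word right before the extension, or None."""
    match = NOTE_RE.search(url)
    return match.group(1) if match else None


def old_note_from_url(url):
    """Previous approach: text between the last '_' and the last '.'."""
    return url.rpartition('.')[0].rpartition('_')[2]


urls = [
    'https://cdn.example.com/media/interview_high.mp4',
    'https://cdn.example.com/media/interview_high.mp4?v=1.2',
    'https://cdn.example.com/media/interview.mp4',
]
for url in urls:
    print(note_from_url(url), '|', old_note_from_url(url))
```

The second and third URLs show where the string-splitting version goes wrong: a dot in the query string or a missing underscore makes it return part of the URL instead of a quality label.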

View File

@ -1,188 +0,0 @@
from .adobepass import AdobePassIE
from ..networking import HEADRequest
from ..utils import (
extract_attributes,
float_or_none,
get_element_html_by_class,
int_or_none,
merge_dicts,
parse_age_limit,
remove_end,
str_or_none,
traverse_obj,
unescapeHTML,
unified_timestamp,
update_url_query,
url_or_none,
)
class BravoTVIE(AdobePassIE):
_VALID_URL = r'https?://(?:www\.)?(?P<site>bravotv|oxygen)\.com/(?:[^/]+/)+(?P<id>[^/?#]+)'
_TESTS = [{
'url': 'https://www.bravotv.com/top-chef/season-16/episode-15/videos/the-top-chef-season-16-winner-is',
'info_dict': {
'id': '3923059',
'ext': 'mp4',
'title': 'The Top Chef Season 16 Winner Is...',
'description': 'Find out who takes the title of Top Chef!',
'upload_date': '20190314',
'timestamp': 1552591860,
'season_number': 16,
'episode_number': 15,
'series': 'Top Chef',
'episode': 'The Top Chef Season 16 Winner Is...',
'duration': 190.357,
'season': 'Season 16',
'thumbnail': r're:^https://.+\.jpg',
},
'params': {'skip_download': 'm3u8'},
}, {
'url': 'https://www.bravotv.com/top-chef/season-20/episode-1/london-calling',
'info_dict': {
'id': '9000234570',
'ext': 'mp4',
'title': 'London Calling',
'description': 'md5:5af95a8cbac1856bd10e7562f86bb759',
'upload_date': '20230310',
'timestamp': 1678410000,
'season_number': 20,
'episode_number': 1,
'series': 'Top Chef',
'episode': 'London Calling',
'duration': 3266.03,
'season': 'Season 20',
'chapters': 'count:7',
'thumbnail': r're:^https://.+\.jpg',
'age_limit': 14,
},
'params': {'skip_download': 'm3u8'},
'skip': 'This video requires AdobePass MSO credentials',
}, {
'url': 'https://www.oxygen.com/in-ice-cold-blood/season-1/closing-night',
'info_dict': {
'id': '3692045',
'ext': 'mp4',
'title': 'Closing Night',
'description': 'md5:3170065c5c2f19548d72a4cbc254af63',
'upload_date': '20180401',
'timestamp': 1522623600,
'season_number': 1,
'episode_number': 1,
'series': 'In Ice Cold Blood',
'episode': 'Closing Night',
'duration': 2629.051,
'season': 'Season 1',
'chapters': 'count:6',
'thumbnail': r're:^https://.+\.jpg',
'age_limit': 14,
},
'params': {'skip_download': 'm3u8'},
'skip': 'This video requires AdobePass MSO credentials',
}, {
'url': 'https://www.oxygen.com/in-ice-cold-blood/season-2/episode-16/videos/handling-the-horwitz-house-after-the-murder-season-2',
'info_dict': {
'id': '3974019',
'ext': 'mp4',
'title': '\'Handling The Horwitz House After The Murder (Season 2, Episode 16)',
'description': 'md5:f9d638dd6946a1c1c0533a9c6100eae5',
'upload_date': '20190617',
'timestamp': 1560790800,
'season_number': 2,
'episode_number': 16,
'series': 'In Ice Cold Blood',
'episode': '\'Handling The Horwitz House After The Murder (Season 2, Episode 16)',
'duration': 68.235,
'season': 'Season 2',
'thumbnail': r're:^https://.+\.jpg',
'age_limit': 14,
},
'params': {'skip_download': 'm3u8'},
}, {
'url': 'https://www.bravotv.com/below-deck/season-3/ep-14-reunion-part-1',
'only_matching': True,
}]
def _real_extract(self, url):
site, display_id = self._match_valid_url(url).group('site', 'id')
webpage = self._download_webpage(url, display_id)
settings = self._search_json(
r'<script[^>]+data-drupal-selector="drupal-settings-json"[^>]*>', webpage, 'settings', display_id)
tve = extract_attributes(get_element_html_by_class('tve-video-deck-app', webpage) or '')
query = {
'manifest': 'm3u',
'formats': 'm3u,mpeg4',
}
if tve:
account_pid = tve.get('data-mpx-media-account-pid') or 'HNK2IC'
account_id = tve['data-mpx-media-account-id']
metadata = self._parse_json(
tve.get('data-normalized-video', ''), display_id, fatal=False, transform_source=unescapeHTML)
video_id = tve.get('data-guid') or metadata['guid']
if tve.get('data-entitlement') == 'auth':
auth = traverse_obj(settings, ('tve_adobe_auth', {dict})) or {}
site = remove_end(site, 'tv')
release_pid = tve['data-release-pid']
resource = self._get_mvpd_resource(
tve.get('data-adobe-pass-resource-id') or auth.get('adobePassResourceId') or site,
tve['data-title'], release_pid, tve.get('data-rating'))
query.update({
'switch': 'HLSServiceSecure',
'auth': self._extract_mvpd_auth(
url, release_pid, auth.get('adobePassRequestorId') or site, resource),
})
else:
ls_playlist = traverse_obj(settings, ('ls_playlist', ..., {dict}), get_all=False) or {}
account_pid = ls_playlist.get('mpxMediaAccountPid') or 'PHSl-B'
account_id = ls_playlist['mpxMediaAccountId']
video_id = ls_playlist['defaultGuid']
metadata = traverse_obj(
ls_playlist, ('videos', lambda _, v: v['guid'] == video_id, {dict}), get_all=False)
tp_url = f'https://link.theplatform.com/s/{account_pid}/media/guid/{account_id}/{video_id}'
tp_metadata = self._download_json(
update_url_query(tp_url, {'format': 'preview'}), video_id, fatal=False)
chapters = traverse_obj(tp_metadata, ('chapters', ..., {
'start_time': ('startTime', {float_or_none(scale=1000)}),
'end_time': ('endTime', {float_or_none(scale=1000)}),
}))
# prune pointless single chapters that span the entire duration from short videos
if len(chapters) == 1 and not traverse_obj(chapters, (0, 'end_time')):
chapters = None
m3u8_url = self._request_webpage(HEADRequest(
update_url_query(f'{tp_url}/stream.m3u8', query)), video_id, 'Checking m3u8 URL').url
if 'mpeg_cenc' in m3u8_url:
self.report_drm(video_id)
formats, subtitles = self._extract_m3u8_formats_and_subtitles(m3u8_url, video_id, 'mp4', m3u8_id='hls')
return {
'id': video_id,
'formats': formats,
'subtitles': subtitles,
'chapters': chapters,
**merge_dicts(traverse_obj(tp_metadata, {
'title': 'title',
'description': 'description',
'duration': ('duration', {float_or_none(scale=1000)}),
'timestamp': ('pubDate', {float_or_none(scale=1000)}),
'season_number': (('pl1$seasonNumber', 'nbcu$seasonNumber'), {int_or_none}),
'episode_number': (('pl1$episodeNumber', 'nbcu$episodeNumber'), {int_or_none}),
'series': (('pl1$show', 'nbcu$show'), (None, ...), {str}),
'episode': (('title', 'pl1$episodeNumber', 'nbcu$episodeNumber'), {str_or_none}),
'age_limit': ('ratings', ..., 'rating', {parse_age_limit}),
}, get_all=False), traverse_obj(metadata, {
'title': 'title',
'description': 'description',
'duration': ('durationInSeconds', {int_or_none}),
'timestamp': ('airDate', {unified_timestamp}),
'thumbnail': ('thumbnailUrl', {url_or_none}),
'season_number': ('seasonNumber', {int_or_none}),
'episode_number': ('episodeNumber', {int_or_none}),
'episode': 'episodeTitle',
'series': 'show',
})),
}

View File

@ -923,10 +923,18 @@ def extract_policy_key():
errors = json_data.get('errors')
if errors and errors[0].get('error_subcode') == 'TVE_AUTH':
custom_fields = json_data['custom_fields']
missing_fields = ', '.join(
key for key in ('source_url', 'software_statement') if not smuggled_data.get(key))
if missing_fields:
raise ExtractorError(
f'Missing fields in smuggled data: {missing_fields}. '
f'This video can be only extracted from the webpage where it is embedded. '
f'Pass the URL of the embedding webpage instead of the Brightcove URL', expected=True)
tve_token = self._extract_mvpd_auth(
smuggled_data['source_url'], video_id,
custom_fields['bcadobepassrequestorid'],
custom_fields['bcadobepassresourceid'])
custom_fields['bcadobepassresourceid'],
smuggled_data['software_statement'])
json_data = self._download_json(
api_url, video_id, headers={
'Accept': f'application/json;pk={policy_key}',

View File

@ -1,59 +0,0 @@
from .turner import TurnerBaseIE
from ..utils import int_or_none
class CartoonNetworkIE(TurnerBaseIE):
_VALID_URL = r'https?://(?:www\.)?cartoonnetwork\.com/video/(?:[^/]+/)+(?P<id>[^/?#]+)-(?:clip|episode)\.html'
_TEST = {
'url': 'https://www.cartoonnetwork.com/video/ben-10/how-to-draw-upgrade-episode.html',
'info_dict': {
'id': '6e3375097f63874ebccec7ef677c1c3845fa850e',
'ext': 'mp4',
'title': 'How to Draw Upgrade',
'description': 'md5:2061d83776db7e8be4879684eefe8c0f',
},
'params': {
# m3u8 download
'skip_download': True,
},
}
def _real_extract(self, url):
display_id = self._match_id(url)
webpage = self._download_webpage(url, display_id)
def find_field(global_re, name, content_re=None, value_re='[^"]+', fatal=False):
metadata_re = ''
if content_re:
metadata_re = r'|video_metadata\.content_' + content_re
return self._search_regex(
rf'(?:_cnglobal\.currentVideo\.{global_re}{metadata_re})\s*=\s*"({value_re})";',
webpage, name, fatal=fatal)
media_id = find_field('mediaId', 'media id', 'id', '[0-9a-f]{40}', True)
title = find_field('episodeTitle', 'title', '(?:episodeName|name)', fatal=True)
info = self._extract_ngtv_info(
media_id, {'networkId': 'cartoonnetwork'}, {
'url': url,
'site_name': 'CartoonNetwork',
'auth_required': find_field('authType', 'auth type') != 'unauth',
})
series = find_field(
'propertyName', 'series', 'showName') or self._html_search_meta('partOfSeries', webpage)
info.update({
'id': media_id,
'display_id': display_id,
'title': title,
'description': self._html_search_meta('description', webpage),
'series': series,
'episode': title,
})
for field in ('season', 'episode'):
field_name = field + 'Number'
info[field + '_number'] = int_or_none(find_field(
field_name, field + ' number', value_re=r'\d+') or self._html_search_meta(field_name, webpage))
return info

View File

@ -13,16 +13,17 @@
from ..utils import (
ExtractorError,
OnDemandPagedList,
determine_ext,
float_or_none,
int_or_none,
merge_dicts,
multipart_encode,
parse_duration,
traverse_obj,
try_call,
try_get,
url_or_none,
urljoin,
)
from ..utils.traversal import traverse_obj
class CDAIE(InfoExtractor):
@ -290,15 +291,16 @@ def extract_format(page, version):
if not video or 'file' not in video:
self.report_warning(f'Unable to extract {version} version information')
return
video_quality = video.get('quality')
qualities = video.get('qualities', {})
video_quality = next((k for k, v in qualities.items() if v == video_quality), video_quality)
if video.get('file'):
if video['file'].startswith('uggc'):
video['file'] = codecs.decode(video['file'], 'rot_13')
if video['file'].endswith('adc.mp4'):
video['file'] = video['file'].replace('adc.mp4', '.mp4')
elif not video['file'].startswith('http'):
video['file'] = decrypt_file(video['file'])
video_quality = video.get('quality')
qualities = video.get('qualities', {})
video_quality = next((k for k, v in qualities.items() if v == video_quality), video_quality)
info_dict['formats'].append({
'url': video['file'],
'format_id': video_quality,
@ -310,14 +312,26 @@ def extract_format(page, version):
data = {'jsonrpc': '2.0', 'method': 'videoGetLink', 'id': 2,
'params': [video_id, cda_quality, video.get('ts'), video.get('hash2'), {}]}
data = json.dumps(data).encode()
video_url = self._download_json(
response = self._download_json(
f'https://www.cda.pl/video/{video_id}', video_id, headers={
'Content-Type': 'application/json',
'X-Requested-With': 'XMLHttpRequest',
}, data=data, note=f'Fetching {quality} url',
errnote=f'Failed to fetch {quality} url', fatal=False)
if try_get(video_url, lambda x: x['result']['status']) == 'ok':
video_url = try_get(video_url, lambda x: x['result']['resp'])
if (
traverse_obj(response, ('result', 'status')) != 'ok'
or not traverse_obj(response, ('result', 'resp', {url_or_none}))
):
continue
video_url = response['result']['resp']
ext = determine_ext(video_url)
if ext == 'mpd':
info_dict['formats'].extend(self._extract_mpd_formats(
video_url, video_id, mpd_id='dash', fatal=False))
elif ext == 'm3u8':
info_dict['formats'].extend(self._extract_m3u8_formats(
video_url, video_id, 'mp4', m3u8_id='hls', fatal=False))
else:
info_dict['formats'].append({
'url': video_url,
'format_id': quality,
@ -353,7 +367,7 @@ def extract_format(page, version):
class CDAFolderIE(InfoExtractor):
_MAX_PAGE_SIZE = 36
_VALID_URL = r'https?://(?:www\.)?cda\.pl/(?P<channel>\w+)/folder/(?P<id>\d+)'
_VALID_URL = r'https?://(?:www\.)?cda\.pl/(?P<channel>[\w-]+)/folder/(?P<id>\d+)'
_TESTS = [
{
'url': 'https://www.cda.pl/domino264/folder/31188385',
@ -378,6 +392,9 @@ class CDAFolderIE(InfoExtractor):
'title': 'TESTY KOSMETYKÓW',
},
'playlist_mincount': 139,
}, {
'url': 'https://www.cda.pl/FILMY-SERIALE-ANIME-KRESKOWKI-BAJKI/folder/18493422',
'only_matching': True,
}]
def _real_extract(self, url):
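Note on the `CDAFolderIE._VALID_URL` change above: `\w+` cannot cross a hyphen, so channel names such as the one in the newly added `only_matching` test did not match. A quick standalone check with both patterns copied from the diff:

```python
import re

OLD_VALID_URL = r'https?://(?:www\.)?cda\.pl/(?P<channel>\w+)/folder/(?P<id>\d+)'
NEW_VALID_URL = r'https?://(?:www\.)?cda\.pl/(?P<channel>[\w-]+)/folder/(?P<id>\d+)'

hyphenated = 'https://www.cda.pl/FILMY-SERIALE-ANIME-KRESKOWKI-BAJKI/folder/18493422'
plain = 'https://www.cda.pl/domino264/folder/31188385'

for url in (hyphenated, plain):
    old, new = re.match(OLD_VALID_URL, url), re.match(NEW_VALID_URL, url)
    # Report whether the old pattern matched and which channel the new one captured
    print(bool(old), new.group('channel'), new.group('id'))
```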

View File

@ -21,7 +21,7 @@ class CHZZKLiveIE(InfoExtractor):
'channel': '진짜도현',
'channel_id': 'c68b8ef525fb3d2fa146344d84991753',
'channel_is_verified': False,
'thumbnail': r're:^https?://.*\.jpg$',
'thumbnail': r're:https?://.+/.+\.jpg',
'timestamp': 1705510344,
'upload_date': '20240117',
'live_status': 'is_live',
@ -98,7 +98,7 @@ class CHZZKVideoIE(InfoExtractor):
'channel': '침착맨',
'channel_id': 'bb382c2c0cc9fa7c86ab3b037fb5799c',
'channel_is_verified': False,
'thumbnail': r're:^https?://.*\.jpg$',
'thumbnail': r're:https?://.+/.+\.jpg',
'duration': 15577,
'timestamp': 1702970505.417,
'upload_date': '20231219',
@ -115,7 +115,7 @@ class CHZZKVideoIE(InfoExtractor):
'channel': '라디유radiyu',
'channel_id': '68f895c59a1043bc5019b5e08c83a5c5',
'channel_is_verified': False,
'thumbnail': r're:^https?://.*\.jpg$',
'thumbnail': r're:https?://.+/.+\.jpg',
'duration': 95,
'timestamp': 1703102631.722,
'upload_date': '20231220',
@ -131,12 +131,30 @@ class CHZZKVideoIE(InfoExtractor):
'channel': '강지',
'channel_id': 'b5ed5db484d04faf4d150aedd362f34b',
'channel_is_verified': True,
'thumbnail': r're:^https?://.*\.jpg$',
'thumbnail': r're:https?://.+/.+\.jpg',
'duration': 4433,
'timestamp': 1703307460.214,
'upload_date': '20231223',
'view_count': int,
},
}, {
# video_status == 'NONE' but is downloadable
'url': 'https://chzzk.naver.com/video/6325166',
'info_dict': {
'id': '6325166',
'ext': 'mp4',
'title': '와이프 숙제빼주기',
'channel': '이 다',
'channel_id': '0076a519f147ee9fd0959bf02f9571ca',
'channel_is_verified': False,
'view_count': int,
'duration': 28167,
'thumbnail': r're:https?://.+/.+\.jpg',
'timestamp': 1742139216.86,
'upload_date': '20250316',
'live_status': 'was_live',
},
'params': {'skip_download': 'm3u8'},
}]
def _real_extract(self, url):
@ -147,11 +165,7 @@ def _real_extract(self, url):
live_status = 'was_live' if video_meta.get('liveOpenDate') else 'not_live'
video_status = video_meta.get('vodStatus')
if video_status == 'UPLOAD':
playback = self._parse_json(video_meta['liveRewindPlaybackJson'], video_id)
formats, subtitles = self._extract_m3u8_formats_and_subtitles(
playback['media'][0]['path'], video_id, 'mp4', m3u8_id='hls')
elif video_status == 'ABR_HLS':
if video_status == 'ABR_HLS':
formats, subtitles = self._extract_mpd_formats_and_subtitles(
f'https://apis.naver.com/neonplayer/vodplay/v1/playback/{video_meta["videoId"]}',
video_id, query={
@ -161,6 +175,13 @@ def _real_extract(self, url):
'cpl': 'en_US',
})
else:
fatal = video_status == 'UPLOAD'
playback = self._parse_json(video_meta['liveRewindPlaybackJson'], video_id, fatal=fatal)
formats, subtitles = self._extract_m3u8_formats_and_subtitles(
traverse_obj(playback, ('media', 0, 'path')), video_id, 'mp4', m3u8_id='hls', fatal=fatal)
if formats and video_status != 'UPLOAD':
self.write_debug(f'Video found with status: "{video_status}"')
elif not formats:
self.raise_no_formats(
f'Unknown video status detected: "{video_status}"', expected=True, video_id=video_id)
formats, subtitles = [], {}
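Note on the thumbnail patterns in the tests above: the expected value changes from `r're:^https?://.*\.jpg$'` to `r're:https?://.+/.+\.jpg'`. The sketch below shows what that changes in practice. It assumes the test helper applies `re:` patterns with `re.match`, which would make the leading `^` redundant; the sample URLs are hypothetical.

```python
import re

OLD_PATTERN = r'^https?://.*\.jpg$'
NEW_PATTERN = r'https?://.+/.+\.jpg'


def matches(pattern, value):
    # Assumption: 're:' expectations are checked from the start of the string
    return re.match(pattern, value) is not None


samples = [
    'https://example.com/thumbs/cover.jpg',           # ordinary thumbnail URL
    'https://.jpg',                                   # no host and no path
    'https://example.com/thumbs/cover.jpg?type=f120', # query string after .jpg
]
for value in samples:
    print(matches(OLD_PATTERN, value), matches(NEW_PATTERN, value), value)
```

Under that assumption the new pattern is stricter about requiring a host and a path, and looser about what follows the `.jpg` suffix.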

View File

@ -78,6 +78,7 @@
parse_iso8601,
parse_m3u8_attributes,
parse_resolution,
qualities,
sanitize_url,
smuggle_url,
str_or_none,
@ -1569,6 +1570,8 @@ def _yield_json_ld(self, html, video_id, *, fatal=True, default=NO_DEFAULT):
"""Yield all json ld objects in the html"""
if default is not NO_DEFAULT:
fatal = False
if not fatal and not isinstance(html, str):
return
for mobj in re.finditer(JSON_LD_RE, html):
json_ld_item = self._parse_json(
mobj.group('json_ld'), video_id, fatal=fatal,
@ -2177,6 +2180,8 @@ def extract_media(x_media_line):
media_url = media.get('URI')
if media_url:
manifest_url = format_url(media_url)
is_audio = media_type == 'AUDIO'
is_alternate = media.get('DEFAULT') == 'NO' or media.get('AUTOSELECT') == 'NO'
formats.extend({
'format_id': join_nonempty(m3u8_id, group_id, name, idx),
'format_note': name,
@ -2189,7 +2194,11 @@ def extract_media(x_media_line):
'preference': preference,
'quality': quality,
'has_drm': has_drm,
'vcodec': 'none' if media_type == 'AUDIO' else None,
'vcodec': 'none' if is_audio else None,
# Alternate audio formats (e.g. audio description) should be deprioritized
'source_preference': -2 if is_audio and is_alternate else None,
# Save this to assign source_preference based on associated video stream
'_audio_group_id': group_id if is_audio and not is_alternate else None,
} for idx in _extract_m3u8_playlist_indices(manifest_url))
def build_stream_name():
@ -2284,6 +2293,8 @@ def build_stream_name():
# ignore references to rendition groups and treat them
# as complete formats.
if audio_group_id and codecs and f.get('vcodec') != 'none':
# Save this to determine quality of audio formats that only have a GROUP-ID
f['_audio_group_id'] = audio_group_id
audio_group = groups.get(audio_group_id)
if audio_group and audio_group[0].get('URI'):
# TODO: update acodec for audio only formats with
@ -2306,6 +2317,28 @@ def build_stream_name():
formats.append(http_f)
last_stream_inf = {}
# Some audio-only formats only have a GROUP-ID without any other quality/bitrate/codec info
# Each audio GROUP-ID corresponds with one or more video formats' AUDIO attribute
# For sorting purposes, set source_preference based on the quality of the video formats they are grouped with
# See https://github.com/yt-dlp/yt-dlp/issues/11178
audio_groups_by_quality = orderedSet(f['_audio_group_id'] for f in sorted(
traverse_obj(formats, lambda _, v: v.get('vcodec') != 'none' and v['_audio_group_id']),
key=lambda x: (x.get('tbr') or 0, x.get('width') or 0)))
audio_quality_map = {
audio_groups_by_quality[0]: 'low',
audio_groups_by_quality[-1]: 'high',
} if len(audio_groups_by_quality) > 1 else None
audio_preference = qualities(audio_groups_by_quality)
for fmt in formats:
audio_group_id = fmt.pop('_audio_group_id', None)
if not audio_quality_map or not audio_group_id or fmt.get('vcodec') != 'none':
continue
# Use source_preference since quality and preference are set by params
fmt['source_preference'] = audio_preference(audio_group_id)
fmt['format_note'] = join_nonempty(
fmt.get('format_note'), audio_quality_map.get(audio_group_id), delim=', ')
return formats, subtitles
def _extract_m3u8_vod_duration(
@ -2935,8 +2968,7 @@ def location_key(location):
segment_duration = None
if 'total_number' not in representation_ms_info and 'segment_duration' in representation_ms_info:
segment_duration = float_or_none(representation_ms_info['segment_duration'], representation_ms_info['timescale'])
representation_ms_info['total_number'] = int(math.ceil(
float_or_none(period_duration, segment_duration, default=0)))
representation_ms_info['total_number'] = math.ceil(float_or_none(period_duration, segment_duration, default=0))
representation_ms_info['fragments'] = [{
media_location_key: media_template % {
'Number': segment_number,
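Note on the audio `GROUP-ID` hunk above: audio-only renditions that carry no bitrate or codec information are ranked by the video variants that reference their group. A simplified sketch of that idea follows. `ordered_unique` and `rank_audio_groups` are hypothetical stand-ins for yt-dlp's `orderedSet` and `qualities` helpers, and the format dicts use an `audio_group` key instead of the internal `_audio_group_id`.

```python
def ordered_unique(items):
    """Drop duplicates while keeping first-seen order."""
    seen, result = set(), []
    for item in items:
        if item not in seen:
            seen.add(item)
            result.append(item)
    return result


def rank_audio_groups(video_formats):
    """Map each audio group to a preference index and an optional low/high note."""
    # Sort the video variants from worst to best, then read off their audio groups
    ranked = sorted(
        (f for f in video_formats if f.get('audio_group')),
        key=lambda f: (f.get('tbr') or 0, f.get('width') or 0))
    groups = ordered_unique(f['audio_group'] for f in ranked)
    preference = {group: index for index, group in enumerate(groups)}
    # Notes are only meaningful when there is more than one group to compare
    notes = {groups[0]: 'low', groups[-1]: 'high'} if len(groups) > 1 else {}
    return preference, notes


variants = [
    {'tbr': 5000, 'width': 1920, 'audio_group': 'aud-hi'},
    {'tbr': 800, 'width': 640, 'audio_group': 'aud-lo'},
    {'tbr': 2500, 'width': 1280, 'audio_group': 'aud-hi'},
]
print(rank_audio_groups(variants))
```

Because duplicates are dropped on first sight, a group shared by several variants is placed according to the weakest variant that uses it; the sketch keeps that property.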

View File

@ -5,7 +5,9 @@
int_or_none,
try_get,
unified_strdate,
url_or_none,
)
from ..utils.traversal import traverse_obj
class CrowdBunkerIE(InfoExtractor):
@ -44,16 +46,15 @@ def _real_extract(self, url):
'url': sub_url,
})
mpd_url = try_get(video_json, lambda x: x['dashManifest']['url'])
if mpd_url:
fmts, subs = self._extract_mpd_formats_and_subtitles(mpd_url, video_id)
if mpd_url := traverse_obj(video_json, ('dashManifest', 'url', {url_or_none})):
fmts, subs = self._extract_mpd_formats_and_subtitles(mpd_url, video_id, mpd_id='dash', fatal=False)
formats.extend(fmts)
subtitles = self._merge_subtitles(subtitles, subs)
m3u8_url = try_get(video_json, lambda x: x['hlsManifest']['url'])
if m3u8_url:
fmts, subs = self._extract_m3u8_formats_and_subtitles(mpd_url, video_id)
self._merge_subtitles(subs, target=subtitles)
if m3u8_url := traverse_obj(video_json, ('hlsManifest', 'url', {url_or_none})):
fmts, subs = self._extract_m3u8_formats_and_subtitles(m3u8_url, video_id, m3u8_id='hls', fatal=False)
formats.extend(fmts)
subtitles = self._merge_subtitles(subtitles, subs)
self._merge_subtitles(subs, target=subtitles)
thumbnails = [{
'url': image['url'],

View File

@ -9,6 +9,7 @@
ExtractorError,
classproperty,
float_or_none,
parse_qs,
traverse_obj,
url_or_none,
)
@ -91,11 +92,15 @@ def _usp_signing_secret(self):
# Rotates every so often, but hardcode a fallback in case of JS change/breakage before rotation
return self._search_regex(
r'\bUSP_SIGNING_SECRET\s*=\s*(["\'])(?P<secret>(?:(?!\1).)+)', player_js,
'usp signing secret', group='secret', fatal=False) or 'odnInCGqhvtyRTtIiddxtuRtawYYICZP'
'usp signing secret', group='secret', fatal=False) or 'hGDtqMKYVeFdofrAfFmBcrsakaZELajI'
def _real_extract(self, url):
user_id, video_id = self._match_valid_url(url).group('user_id', 'id')
query = {'contentId': f'{user_id}-vod-{video_id}', 'provider': 'universe'}
query = {
'contentId': f'{user_id}-vod-{video_id}',
'provider': 'universe',
**traverse_obj(url, ({parse_qs}, 'uss_token', {'signedKey': -1})),
}
info = self._download_json(self._API_INFO_URL, video_id, query=query, fatal=False)
access = self._download_json(
'https://playback.dacast.com/content/access', video_id,
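
Note on the hunk above: as I read the added `traverse_obj` line, it forwards the last `uss_token` value from the page URL as `signedKey` and adds nothing when the parameter is absent. A rough stdlib equivalent under that assumption, where `build_query` is a hypothetical name:

```python
from urllib.parse import parse_qs, urlparse


def build_query(url, user_id, video_id):
    """Build the info-API query, adding signedKey only if uss_token is in the URL."""
    query = {
        'contentId': f'{user_id}-vod-{video_id}',
        'provider': 'universe',
    }
    # parse_qs maps each parameter to a list of values; take the last one
    tokens = parse_qs(urlparse(url).query).get('uss_token')
    if tokens:
        query['signedKey'] = tokens[-1]
    return query
```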


@ -1,142 +0,0 @@
import json
from .common import InfoExtractor
from ..utils import (
ExtractorError,
int_or_none,
orderedSet,
)
class DeezerBaseInfoExtractor(InfoExtractor):
def get_data(self, url):
if not self.get_param('test'):
self.report_warning('For now, this extractor only supports the 30 second previews. Patches welcome!')
mobj = self._match_valid_url(url)
data_id = mobj.group('id')
webpage = self._download_webpage(url, data_id)
geoblocking_msg = self._html_search_regex(
r'<p class="soon-txt">(.*?)</p>', webpage, 'geoblocking message',
default=None)
if geoblocking_msg is not None:
raise ExtractorError(
f'Deezer said: {geoblocking_msg}', expected=True)
data_json = self._search_regex(
(r'__DZR_APP_STATE__\s*=\s*({.+?})\s*</script>',
r'naboo\.display\(\'[^\']+\',\s*(.*?)\);\n'),
webpage, 'data JSON')
data = json.loads(data_json)
return data_id, webpage, data
class DeezerPlaylistIE(DeezerBaseInfoExtractor):
_VALID_URL = r'https?://(?:www\.)?deezer\.com/(../)?playlist/(?P<id>[0-9]+)'
_TEST = {
'url': 'http://www.deezer.com/playlist/176747451',
'info_dict': {
'id': '176747451',
'title': 'Best!',
'uploader': 'anonymous',
'thumbnail': r're:^https?://(e-)?cdns-images\.dzcdn\.net/images/cover/.*\.jpg$',
},
'playlist_count': 29,
}
def _real_extract(self, url):
playlist_id, webpage, data = self.get_data(url)
playlist_title = data.get('DATA', {}).get('TITLE')
playlist_uploader = data.get('DATA', {}).get('PARENT_USERNAME')
playlist_thumbnail = self._search_regex(
r'<img id="naboo_playlist_image".*?src="([^"]+)"', webpage,
'playlist thumbnail')
entries = []
for s in data.get('SONGS', {}).get('data'):
formats = [{
'format_id': 'preview',
'url': s.get('MEDIA', [{}])[0].get('HREF'),
'preference': -100, # Only the first 30 seconds
'ext': 'mp3',
}]
artists = ', '.join(
orderedSet(a.get('ART_NAME') for a in s.get('ARTISTS')))
entries.append({
'id': s.get('SNG_ID'),
'duration': int_or_none(s.get('DURATION')),
'title': '{} - {}'.format(artists, s.get('SNG_TITLE')),
'uploader': s.get('ART_NAME'),
'uploader_id': s.get('ART_ID'),
'age_limit': 16 if s.get('EXPLICIT_LYRICS') == '1' else 0,
'formats': formats,
})
return {
'_type': 'playlist',
'id': playlist_id,
'title': playlist_title,
'uploader': playlist_uploader,
'thumbnail': playlist_thumbnail,
'entries': entries,
}
class DeezerAlbumIE(DeezerBaseInfoExtractor):
_VALID_URL = r'https?://(?:www\.)?deezer\.com/(../)?album/(?P<id>[0-9]+)'
_TEST = {
'url': 'https://www.deezer.com/fr/album/67505622',
'info_dict': {
'id': '67505622',
'title': 'Last Week',
'uploader': 'Home Brew',
'thumbnail': r're:^https?://(e-)?cdns-images\.dzcdn\.net/images/cover/.*\.jpg$',
},
'playlist_count': 7,
}
def _real_extract(self, url):
album_id, webpage, data = self.get_data(url)
album_title = data.get('DATA', {}).get('ALB_TITLE')
album_uploader = data.get('DATA', {}).get('ART_NAME')
album_thumbnail = self._search_regex(
r'<img id="naboo_album_image".*?src="([^"]+)"', webpage,
'album thumbnail')
entries = []
for s in data.get('SONGS', {}).get('data'):
formats = [{
'format_id': 'preview',
'url': s.get('MEDIA', [{}])[0].get('HREF'),
'preference': -100, # Only the first 30 seconds
'ext': 'mp3',
}]
artists = ', '.join(
orderedSet(a.get('ART_NAME') for a in s.get('ARTISTS')))
entries.append({
'id': s.get('SNG_ID'),
'duration': int_or_none(s.get('DURATION')),
'title': '{} - {}'.format(artists, s.get('SNG_TITLE')),
'uploader': s.get('ART_NAME'),
'uploader_id': s.get('ART_ID'),
'age_limit': 16 if s.get('EXPLICIT_LYRICS') == '1' else 0,
'formats': formats,
'track': s.get('SNG_TITLE'),
'track_number': int_or_none(s.get('TRACK_NUMBER')),
'track_id': s.get('SNG_ID'),
'artist': album_uploader,
'album': album_title,
'album_artist': album_uploader,
})
return {
'_type': 'playlist',
'id': album_id,
'title': album_title,
'uploader': album_uploader,
'thumbnail': album_thumbnail,
'entries': entries,
}


@ -1,9 +1,15 @@
from .zdf import ZDFBaseIE
from ..utils import (
int_or_none,
merge_dicts,
parse_iso8601,
)
from ..utils.traversal import require, traverse_obj
class DreiSatIE(ZDFBaseIE):
IE_NAME = '3sat'
_VALID_URL = r'https?://(?:www\.)?3sat\.de/(?:[^/]+/)*(?P<id>[^/?#&]+)\.html'
_VALID_URL = r'https?://(?:www\.)?3sat\.de/(?:[^/?#]+/)*(?P<id>[^/?#&]+)\.html'
_TESTS = [{
'url': 'https://www.3sat.de/dokumentation/reise/traumziele-suedostasiens-die-philippinen-und-vietnam-102.html',
'info_dict': {
@ -12,40 +18,59 @@ class DreiSatIE(ZDFBaseIE):
'title': 'Traumziele Südostasiens (1/2): Die Philippinen und Vietnam',
'description': 'md5:26329ce5197775b596773b939354079d',
'duration': 2625.0,
'thumbnail': 'https://www.3sat.de/assets/traumziele-suedostasiens-die-philippinen-und-vietnam-100~2400x1350?cb=1699870351148',
'thumbnail': 'https://www.3sat.de/assets/traumziele-suedostasiens-die-philippinen-und-vietnam-100~original?cb=1699870351148',
'episode': 'Traumziele Südostasiens (1/2): Die Philippinen und Vietnam',
'episode_id': 'POS_cc7ff51c-98cf-4d12-b99d-f7a551de1c95',
'timestamp': 1738593000,
'upload_date': '20250203',
'timestamp': 1747920900,
'upload_date': '20250522',
},
}, {
# Same as https://www.zdf.de/dokumentation/ab-18/10-wochen-sommer-102.html
'url': 'https://www.3sat.de/film/ab-18/10-wochen-sommer-108.html',
'md5': '0aff3e7bc72c8813f5e0fae333316a1d',
'url': 'https://www.3sat.de/film/ab-18/ab-18---mein-fremdes-ich-100.html',
'md5': 'f92638413a11d759bdae95c9d8ec165c',
'info_dict': {
'id': '141007_ab18_10wochensommer_film',
'id': '221128_mein_fremdes_ich2_ab18',
'ext': 'mp4',
'title': 'Ab 18! - 10 Wochen Sommer',
'description': 'md5:8253f41dc99ce2c3ff892dac2d65fe26',
'duration': 2660,
'timestamp': 1608604200,
'upload_date': '20201222',
'title': 'Ab 18! - Mein fremdes Ich',
'description': 'md5:cae0c0b27b7426d62ca0dda181738bf0',
'duration': 2625.0,
'thumbnail': 'https://www.3sat.de/assets/ab-18---mein-fremdes-ich-106~original?cb=1666081865812',
'episode': 'Ab 18! - Mein fremdes Ich',
'episode_id': 'POS_6225d1ca-a0d5-45e3-870b-e783ee6c8a3f',
'timestamp': 1695081600,
'upload_date': '20230919',
},
'skip': '410 Gone',
}, {
'url': 'https://www.3sat.de/gesellschaft/schweizweit/waidmannsheil-100.html',
'url': 'https://www.3sat.de/gesellschaft/37-grad-leben/aus-dem-leben-gerissen-102.html',
'md5': 'a903eaf8d1fd635bd3317cd2ad87ec84',
'info_dict': {
'id': '140913_sendung_schweizweit',
'id': '250323_0903_sendung_sgl',
'ext': 'mp4',
'title': 'Waidmannsheil',
'description': 'md5:cce00ca1d70e21425e72c86a98a56817',
'timestamp': 1410623100,
'upload_date': '20140913',
'title': 'Plötzlich ohne dich',
'description': 'md5:380cc10659289dd91510ad8fa717c66b',
'duration': 1620.0,
'thumbnail': 'https://www.3sat.de/assets/37-grad-leben-106~original?cb=1645537156810',
'episode': 'Plötzlich ohne dich',
'episode_id': 'POS_faa7a93c-c0f2-4d51-823f-ce2ac3ee191b',
'timestamp': 1743162540,
'upload_date': '20250328',
},
'params': {
'skip_download': True,
}, {
# Video with chapters
'url': 'https://www.3sat.de/kultur/buchmesse/dein-buch-das-beste-von-der-leipziger-buchmesse-2025-teil-1-100.html',
'md5': '6b95790ce52e75f0d050adcdd2711ee6',
'info_dict': {
'id': '250330_dein_buch1_bum',
'ext': 'mp4',
'title': 'dein buch - Das Beste von der Leipziger Buchmesse 2025 - Teil 1',
'description': 'md5:bae51bfc22f15563ce3acbf97d2e8844',
'duration': 5399.0,
'thumbnail': 'https://www.3sat.de/assets/buchmesse-kerkeling-100~original?cb=1743329640903',
'chapters': 'count:24',
'episode': 'dein buch - Das Beste von der Leipziger Buchmesse 2025 - Teil 1',
'episode_id': 'POS_1ef236cc-b390-401e-acd0-4fb4b04315fb',
'timestamp': 1743327000,
'upload_date': '20250330',
},
'skip': '404 Not Found',
}, {
# Same as https://www.zdf.de/filme/filme-sonstige/der-hauptmann-112.html
'url': 'https://www.3sat.de/film/spielfilm/der-hauptmann-100.html',
@ -58,11 +83,42 @@ class DreiSatIE(ZDFBaseIE):
def _real_extract(self, url):
video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id)
player = self._search_json(
r'data-zdfplayer-jsb=(["\'])', webpage, 'player JSON', video_id)
player_url = player['content']
api_token = f'Bearer {player["apiToken"]}'
webpage = self._download_webpage(url, video_id, fatal=False)
if webpage:
player = self._extract_player(webpage, url, fatal=False)
if player:
return self._extract_regular(url, player, video_id)
content = self._call_api(player_url, video_id, 'video metadata', api_token)
return self._extract_mobile(video_id)
video_target = content['mainVideoContent']['http://zdf.de/rels/target']
ptmd_path = traverse_obj(video_target, (
(('streams', 'default'), None),
('http://zdf.de/rels/streams/ptmd', 'http://zdf.de/rels/streams/ptmd-template'),
{str}, any, {require('ptmd path')}))
ptmd_url = self._expand_ptmd_template(player_url, ptmd_path)
aspect_ratio = self._parse_aspect_ratio(video_target.get('aspectRatio'))
info = self._extract_ptmd(ptmd_url, video_id, api_token, aspect_ratio)
return merge_dicts(info, {
**traverse_obj(content, {
'title': (('title', 'teaserHeadline'), {str}, any),
'episode': (('title', 'teaserHeadline'), {str}, any),
'description': (('leadParagraph', 'teasertext'), {str}, any),
'timestamp': ('editorialDate', {parse_iso8601}),
}),
**traverse_obj(video_target, {
'duration': ('duration', {int_or_none}),
'chapters': ('streamAnchorTag', {self._extract_chapters}),
}),
'thumbnails': self._extract_thumbnails(traverse_obj(content, ('teaserImageRef', 'layouts', {dict}))),
**traverse_obj(content, ('programmeItem', 0, 'http://zdf.de/rels/target', {
'series_id': ('http://zdf.de/rels/cmdm/series', 'seriesUuid', {str}),
'series': ('http://zdf.de/rels/cmdm/series', 'seriesTitle', {str}),
'season': ('http://zdf.de/rels/cmdm/season', 'seasonTitle', {str}),
'season_number': ('http://zdf.de/rels/cmdm/season', 'seasonNumber', {int_or_none}),
'season_id': ('http://zdf.de/rels/cmdm/season', 'seasonUuid', {str}),
'episode_number': ('episodeNumber', {int_or_none}),
'episode_id': ('contentId', {str}),
})),
})


@ -5,7 +5,6 @@
from .adobepass import AdobePassIE
from .common import InfoExtractor
from .once import OnceIE
from ..utils import (
determine_ext,
dict_get,
@ -16,7 +15,7 @@
)
class ESPNIE(OnceIE):
class ESPNIE(InfoExtractor):
_VALID_URL = r'''(?x)
https?://
(?:
@ -131,9 +130,7 @@ def extract_source(source_url, source_id=None):
return
format_urls.add(source_url)
ext = determine_ext(source_url)
if OnceIE.suitable(source_url):
formats.extend(self._extract_once_formats(source_url))
elif ext == 'smil':
if ext == 'smil':
formats.extend(self._extract_smil_formats(
source_url, video_id, fatal=False))
elif ext == 'f4m':
@ -332,6 +329,7 @@ class WatchESPNIE(AdobePassIE):
}]
_API_KEY = 'ZXNwbiZicm93c2VyJjEuMC4w.ptUt7QxsteaRruuPmGZFaJByOoqKvDP2a5YkInHrc7c'
_SOFTWARE_STATEMENT = 'eyJhbGciOiJSUzI1NiJ9.eyJzdWIiOiIyZGJmZWM4My03OWE1LTQyNzEtYTVmZC04NTZjYTMxMjRjNjMiLCJuYmYiOjE1NDAyMTI3NjEsImlzcyI6ImF1dGguYWRvYmUuY29tIiwiaWF0IjoxNTQwMjEyNzYxfQ.yaK3r4AI2uLVvsyN1GLzqzgzRlxMPtasSaiYYBV0wIstqih5tvjTmeoLmi8Xy9Kp_U7Md-bOffwiyK3srHkpUkhhwXLH2x6RPjmS1tPmhaG7-3LBcHTf2ySPvXhVf7cN4ngldawK4tdtLtsw6rF_JoZE2yaC6XbS2F51nXSFEDDnOQWIHEQRG3aYAj-38P2CLGf7g-Yfhbp5cKXeksHHQ90u3eOO4WH0EAjc9oO47h33U8KMEXxJbvjV5J8Va2G2fQSgLDZ013NBI3kQnE313qgqQh2feQILkyCENpB7g-TVBreAjOaH1fU471htSoGGYepcAXv-UDtpgitDiLy7CQ'
def _call_bamgrid_api(self, path, video_id, payload=None, headers={}):
if 'Authorization' not in headers:
@ -408,8 +406,8 @@ def _real_extract(self, url):
# TV Provider required
else:
resource = self._get_mvpd_resource('ESPN', video_data['name'], video_id, None)
auth = self._extract_mvpd_auth(url, video_id, 'ESPN', resource).encode()
resource = self._get_mvpd_resource('espn1', video_data['name'], video_id, None)
auth = self._extract_mvpd_auth(url, video_id, 'ESPN', resource, self._SOFTWARE_STATEMENT).encode()
asset = self._download_json(
f'https://watch.auth.api.espn.com/video/auth/media/{video_id}/asset?apikey=uiqlbgzdwuru14v627vdusswb',


@ -2,11 +2,15 @@
from .common import InfoExtractor
from ..utils import (
determine_ext,
int_or_none,
qualities,
join_nonempty,
mimetype2ext,
parse_qs,
unified_strdate,
url_or_none,
)
from ..utils.traversal import traverse_obj
class FirstTVIE(InfoExtractor):
@ -15,40 +19,51 @@ class FirstTVIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?(?:sport)?1tv\.ru/(?:[^/?#]+/)+(?P<id>[^/?#]+)'
_TESTS = [{
# single format
'url': 'http://www.1tv.ru/shows/naedine-so-vsemi/vypuski/gost-lyudmila-senchina-naedine-so-vsemi-vypusk-ot-12-02-2015',
'md5': 'a1b6b60d530ebcf8daacf4565762bbaf',
# single format; has item.id
'url': 'https://www.1tv.ru/shows/naedine-so-vsemi/vypuski/gost-lyudmila-senchina-naedine-so-vsemi-vypusk-ot-12-02-2015',
'md5': '8011ae8e88ff4150107ab9c5a8f5b659',
'info_dict': {
'id': '40049',
'ext': 'mp4',
'title': 'Гость Людмила Сенчина. Наедине со всеми. Выпуск от 12.02.2015',
'thumbnail': r're:^https?://.*\.(?:jpg|JPG)$',
'thumbnail': r're:https?://.+/.+\.jpg',
'upload_date': '20150212',
'duration': 2694,
},
'params': {'skip_download': 'm3u8'},
}, {
# multiple formats
'url': 'http://www.1tv.ru/shows/dobroe-utro/pro-zdorove/vesennyaya-allergiya-dobroe-utro-fragment-vypuska-ot-07042016',
# multiple formats; has item.id
'url': 'https://www.1tv.ru/shows/dobroe-utro/pro-zdorove/vesennyaya-allergiya-dobroe-utro-fragment-vypuska-ot-07042016',
'info_dict': {
'id': '364746',
'ext': 'mp4',
'title': 'Весенняя аллергия. Доброе утро. Фрагмент выпуска от 07.04.2016',
'thumbnail': r're:^https?://.*\.(?:jpg|JPG)$',
'thumbnail': r're:https?://.+/.+\.jpg',
'upload_date': '20160407',
'duration': 179,
'formats': 'mincount:3',
},
'params': {
'skip_download': True,
},
'params': {'skip_download': 'm3u8'},
}, {
'url': 'http://www.1tv.ru/news/issue/2016-12-01/14:00',
'url': 'https://www.1tv.ru/news/issue/2016-12-01/14:00',
'info_dict': {
'id': '14:00',
'title': 'Выпуск новостей в 14:00 1 декабря 2016 года. Новости. Первый канал',
'description': 'md5:2e921b948f8c1ff93901da78ebdb1dfd',
'title': 'Выпуск программы «Время» в 20:00 1 декабря 2016 года. Новости. Первый канал',
'thumbnail': 'https://static.1tv.ru/uploads/photo/image/8/big/338448_big_8fc7eb236f.jpg',
},
'playlist_count': 13,
}, {
# has timestamp; has item.uid but not item.id
'url': 'https://www.1tv.ru/shows/segodnya-vecherom/vypuski/avtory-odnogo-hita-segodnya-vecherom-vypusk-ot-03-05-2025',
'info_dict': {
'id': '270411',
'ext': 'mp4',
'title': 'Авторы одного хита. Сегодня вечером. Выпуск от 03.05.2025',
'thumbnail': r're:https?://.+/.+\.jpg',
'timestamp': 1746286020,
'upload_date': '20250503',
},
'params': {'skip_download': 'm3u8'},
}, {
'url': 'http://www.1tv.ru/shows/tochvtoch-supersezon/vystupleniya/evgeniy-dyatlov-vladimir-vysockiy-koni-priveredlivye-toch-v-toch-supersezon-fragment-vypuska-ot-06-11-2016',
'only_matching': True,
@ -57,96 +72,60 @@ class FirstTVIE(InfoExtractor):
'only_matching': True,
}]
def _entries(self, items):
for item in items:
video_id = str(item.get('id') or item['uid'])
formats, subtitles = [], {}
for f in traverse_obj(item, ('sources', lambda _, v: url_or_none(v['src']))):
src = f['src']
ext = mimetype2ext(f.get('type'), default=determine_ext(src))
if ext == 'm3u8':
fmts, subs = self._extract_m3u8_formats_and_subtitles(
src, video_id, 'mp4', m3u8_id='hls', fatal=False)
elif ext == 'mpd':
fmts, subs = self._extract_mpd_formats_and_subtitles(
src, video_id, mpd_id='dash', fatal=False)
else:
tbr = self._search_regex(fr'_(\d{{3,}})\.{ext}', src, 'tbr', default=None)
formats.append({
'url': src,
'ext': ext,
'format_id': join_nonempty('http', ext, tbr),
'tbr': int_or_none(tbr),
# quality metadata of http formats may be incorrect
'quality': -10,
})
continue
formats.extend(fmts)
self._merge_subtitles(subs, target=subtitles)
yield {
**traverse_obj(item, {
'title': ('title', {str}),
'thumbnail': ('poster', {url_or_none}),
'timestamp': ('dvr_begin_at', {int_or_none}),
'upload_date': ('date_air', {unified_strdate}),
'duration': ('duration', {int_or_none}),
}),
'id': video_id,
'formats': formats,
'subtitles': subtitles,
}
def _real_extract(self, url):
display_id = self._match_id(url)
webpage = self._download_webpage(url, display_id)
playlist_url = urllib.parse.urljoin(url, self._search_regex(
playlist_url = urllib.parse.urljoin(url, self._html_search_regex(
r'data-playlist-url=(["\'])(?P<url>(?:(?!\1).)+)\1',
webpage, 'playlist url', group='url'))
parsed_url = urllib.parse.urlparse(playlist_url)
qs = urllib.parse.parse_qs(parsed_url.query)
item_ids = qs.get('videos_ids[]') or qs.get('news_ids[]')
item_ids = traverse_obj(parse_qs(playlist_url), 'video_id', 'videos_ids[]', 'news_ids[]')
items = traverse_obj(
self._download_json(playlist_url, display_id),
lambda _, v: v['uid'] and (str(v['uid']) in item_ids if item_ids else True))
items = self._download_json(playlist_url, display_id)
if item_ids:
items = [
item for item in items
if item.get('uid') and str(item['uid']) in item_ids]
else:
items = [items[0]]
entries = []
QUALITIES = ('ld', 'sd', 'hd')
for item in items:
title = item['title']
quality = qualities(QUALITIES)
formats = []
path = None
for f in item.get('mbr', []):
src = url_or_none(f.get('src'))
if not src:
continue
tbr = int_or_none(self._search_regex(
r'_(\d{3,})\.mp4', src, 'tbr', default=None))
if not path:
path = self._search_regex(
r'//[^/]+/(.+?)_\d+\.mp4', src,
'm3u8 path', default=None)
formats.append({
'url': src,
'format_id': f.get('name'),
'tbr': tbr,
'source_preference': quality(f.get('name')),
# quality metadata of http formats may be incorrect
'preference': -10,
})
# m3u8 URL format is reverse engineered from [1] (search for
# master.m3u8). dashEdges (that is currently balancer-vod.1tv.ru)
# is taken from [2].
# 1. http://static.1tv.ru/player/eump1tv-current/eump-1tv.all.min.js?rnd=9097422834:formatted
# 2. http://static.1tv.ru/player/eump1tv-config/config-main.js?rnd=9097422834
if not path and len(formats) == 1:
path = self._search_regex(
r'//[^/]+/(.+?$)', formats[0]['url'],
'm3u8 path', default=None)
if path:
if len(formats) == 1:
m3u8_path = ','
else:
tbrs = [str(t) for t in sorted(f['tbr'] for f in formats)]
m3u8_path = '_,{},{}'.format(','.join(tbrs), '.mp4')
formats.extend(self._extract_m3u8_formats(
f'http://balancer-vod.1tv.ru/{path}{m3u8_path}.urlset/master.m3u8',
display_id, 'mp4',
entry_protocol='m3u8_native', m3u8_id='hls', fatal=False))
thumbnail = item.get('poster') or self._og_search_thumbnail(webpage)
duration = int_or_none(item.get('duration') or self._html_search_meta(
'video:duration', webpage, 'video duration', fatal=False))
upload_date = unified_strdate(self._html_search_meta(
'ya:ovs:upload_date', webpage, 'upload date', default=None))
entries.append({
'id': str(item.get('id') or item['uid']),
'thumbnail': thumbnail,
'title': title,
'upload_date': upload_date,
'duration': int_or_none(duration),
'formats': formats,
})
title = self._html_search_regex(
(r'<div class="tv_translation">\s*<h1><a href="[^"]+">([^<]*)</a>',
r"'title'\s*:\s*'([^']+)'"),
webpage, 'title', default=None) or self._og_search_title(
webpage, default=None)
description = self._html_search_regex(
r'<div class="descr">\s*<div>&nbsp;</div>\s*<p>([^<]*)</p></div>',
webpage, 'description', default=None) or self._html_search_meta(
'description', webpage, 'description', default=None)
return self.playlist_result(entries, display_id, title, description)
return self.playlist_result(
self._entries(items), display_id, self._og_search_title(webpage, default=None),
thumbnail=self._og_search_thumbnail(webpage, default=None))
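
Note on the new `_entries` helper above: for sources that are neither HLS nor DASH, the bitrate is read from a `_<digits>.<ext>` suffix in the URL and folded into the format ID. A small sketch of that fallback branch, assuming a `-` delimiter for the joined ID. `http_format` is a hypothetical name, and unlike the diff this version escapes `ext` before building the regex.

```python
import re


def http_format(src, ext):
    """Describe a progressive HTTP source, reading tbr from a '_<digits>.<ext>' suffix."""
    match = re.search(rf'_(\d{{3,}})\.{re.escape(ext)}', src)
    tbr = match.group(1) if match else None
    return {
        'url': src,
        'ext': ext,
        # Join only the parts that are present, e.g. 'http-mp4-1500' or 'http-mp4'
        'format_id': '-'.join(part for part in ('http', ext, tbr) if part),
        'tbr': int(tbr) if tbr else None,
        # Mirrors the diff: metadata of http formats may be incorrect, so rank them low
        'quality': -10,
    }
```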


@ -0,0 +1,87 @@
import urllib.parse
from .common import InfoExtractor
from ..networking.exceptions import HTTPError
from ..utils import (
ExtractorError,
float_or_none,
url_or_none,
)
from ..utils.traversal import traverse_obj
class FrancaisFacileIE(InfoExtractor):
_VALID_URL = r'https?://francaisfacile\.rfi\.fr/[a-z]{2}/(?:actualit%C3%A9|podcasts/[^/#?]+)/(?P<id>[^/#?]+)'
_TESTS = [{
'url': 'https://francaisfacile.rfi.fr/fr/actualit%C3%A9/20250305-r%C3%A9concilier-les-jeunes-avec-la-lecture-gr%C3%A2ce-aux-r%C3%A9seaux-sociaux',
'md5': '4f33674cb205744345cc835991100afa',
'info_dict': {
'id': 'WBMZ58952-FLE-FR-20250305',
'display_id': '20250305-réconcilier-les-jeunes-avec-la-lecture-grâce-aux-réseaux-sociaux',
'title': 'Réconcilier les jeunes avec la lecture grâce aux réseaux sociaux',
'url': 'https://aod-fle.akamaized.net/fle/sounds/fr/2025/03/05/6b6af52a-f9ba-11ef-a1f8-005056a97652.mp3',
'ext': 'mp3',
'description': 'md5:b903c63d8585bd59e8cc4d5f80c4272d',
'duration': 103.15,
'timestamp': 1741177984,
'upload_date': '20250305',
},
}, {
'url': 'https://francaisfacile.rfi.fr/fr/actualit%C3%A9/20250307-argentine-le-sac-d-un-alpiniste-retrouv%C3%A9-40-ans-apr%C3%A8s-sa-mort',
'md5': 'b8c3a63652d4ae8e8092dda5700c1cd9',
'info_dict': {
'id': 'WBMZ59102-FLE-FR-20250307',
'display_id': '20250307-argentine-le-sac-d-un-alpiniste-retrouvé-40-ans-après-sa-mort',
'title': 'Argentine: le sac d\'un alpiniste retrouvé 40 ans après sa mort',
'url': 'https://aod-fle.akamaized.net/fle/sounds/fr/2025/03/07/8edf4082-fb46-11ef-8a37-005056bf762b.mp3',
'ext': 'mp3',
'description': 'md5:7fd088fbdf4a943bb68cf82462160dca',
'duration': 117.74,
'timestamp': 1741352789,
'upload_date': '20250307',
},
}, {
'url': 'https://francaisfacile.rfi.fr/fr/podcasts/un-mot-une-histoire/20250317-le-mot-de-david-foenkinos-peut-%C3%AAtre',
'md5': 'db83c2cc2589b4c24571c6b6cf14f5f1',
'info_dict': {
'id': 'WBMZ59441-FLE-FR-20250317',
'display_id': '20250317-le-mot-de-david-foenkinos-peut-être',
'title': 'Le mot de David Foenkinos: «peut-être» - Un mot, une histoire',
'url': 'https://aod-fle.akamaized.net/fle/sounds/fr/2025/03/17/4ca6cbbe-0315-11f0-a85b-005056a97652.mp3',
'ext': 'mp3',
'description': 'md5:3fe35fae035803df696bfa7af2496e49',
'duration': 198.96,
'timestamp': 1742210897,
'upload_date': '20250317',
},
}]
def _real_extract(self, url):
display_id = urllib.parse.unquote(self._match_id(url))
try: # yt-dlp's default user-agents are too old and blocked by the site
webpage = self._download_webpage(url, display_id, headers={
'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; rv:136.0) Gecko/20100101 Firefox/136.0',
})
except ExtractorError as e:
if not isinstance(e.cause, HTTPError) or e.cause.status != 403:
raise
# Retry with impersonation if hardcoded UA is insufficient
webpage = self._download_webpage(url, display_id, impersonate=True)
data = self._search_json(
r'<script[^>]+\bdata-media-id=[^>]+\btype="application/json"[^>]*>',
webpage, 'audio data', display_id)
return {
'id': data['mediaId'],
'display_id': display_id,
'vcodec': 'none',
'title': self._html_extract_title(webpage),
**self._search_json_ld(webpage, display_id, fatal=False),
**traverse_obj(data, {
'title': ('title', {str}),
'url': ('sources', ..., 'url', {url_or_none}, any),
'duration': ('sources', ..., 'duration', {float_or_none}, any),
}),
}
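
Note on the new extractor above: the page download first tries a hardcoded browser User-Agent and retries with impersonation only when the server answers 403. The control flow can be sketched independently of yt-dlp as below. `HTTPStatusError`, `download_with_fallback` and the `fetch` callable are hypothetical stand-ins, not yt-dlp APIs.

```python
class HTTPStatusError(Exception):
    """Hypothetical stand-in for an HTTP error that carries a status code."""

    def __init__(self, status):
        super().__init__(f'HTTP Error {status}')
        self.status = status


BROWSER_UA = 'Mozilla/5.0 (Windows NT 10.0; rv:136.0) Gecko/20100101 Firefox/136.0'


def download_with_fallback(fetch, url):
    """Try a plain request with a browser UA; on 403 only, retry with impersonation."""
    try:
        return fetch(url, headers={'User-Agent': BROWSER_UA})
    except HTTPStatusError as e:
        if e.status != 403:
            raise  # Any other failure is not something impersonation would fix
        return fetch(url, impersonate=True)
```

Re-raising everything except 403 keeps unrelated failures such as 404 or 500 visible instead of hiding them behind a second request.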


@ -1,9 +1,9 @@
import urllib.parse
from .once import OnceIE
from .common import InfoExtractor
class GameSpotIE(OnceIE):
class GameSpotIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?gamespot\.com/(?:video|article|review)s/(?:[^/]+/\d+-|embed/)(?P<id>\d+)'
_TESTS = [{
'url': 'http://www.gamespot.com/videos/arma-3-community-guide-sitrep-i/2300-6410818/',


@ -37,6 +37,7 @@
unescapeHTML,
unified_timestamp,
unsmuggle_url,
update_url,
update_url_query,
url_or_none,
urlhandle_detect_ext,
@ -2213,10 +2214,21 @@ def hex_or_none(value):
if is_live is not None:
info['live_status'] = 'not_live' if is_live == 'false' else 'is_live'
return
headers = m3u8_format.get('http_headers') or info.get('http_headers')
duration = self._extract_m3u8_vod_duration(
m3u8_format['url'], info.get('id'), note='Checking m3u8 live status',
errnote='Failed to download m3u8 media playlist', headers=headers)
headers = m3u8_format.get('http_headers') or info.get('http_headers') or {}
display_id = info.get('id')
urlh = self._request_webpage(
m3u8_format['url'], display_id, 'Checking m3u8 live status', errnote=False,
headers={**headers, 'Accept-Encoding': 'identity'}, fatal=False)
if urlh is False:
return
first_bytes = urlh.read(512)
if not first_bytes.startswith(b'#EXTM3U'):
return
m3u8_doc = self._webpage_read_content(
urlh, urlh.url, display_id, prefix=first_bytes, fatal=False, errnote=False)
if not m3u8_doc:
return
duration = self._parse_m3u8_vod_duration(m3u8_doc, display_id)
if not duration:
info['live_status'] = 'is_live'
info['duration'] = info.get('duration') or duration
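
Note on the hunk above: the added lines read only the first 512 bytes and stop unless they start with `#EXTM3U`, so a non-playlist response is never read in full. A stdlib-only sketch of that sniff-then-parse flow follows. `probe_m3u8` is a hypothetical name, and the duration rule (sum of `#EXTINF` values, only when `#EXT-X-ENDLIST` is present) is my assumption about what a VOD duration parser does, not a copy of `_parse_m3u8_vod_duration`.

```python
import io
import re


def probe_m3u8(stream, sniff_size=512):
    """Return None for non-m3u8 data, else a dict with live_status and duration."""
    first_bytes = stream.read(sniff_size)
    if not first_bytes.startswith(b'#EXTM3U'):
        return None  # Not a playlist: stop before reading the rest of the body
    doc = (first_bytes + stream.read()).decode('utf-8', errors='replace')
    duration = None
    if '#EXT-X-ENDLIST' in doc:
        # A finished (VOD) playlist: add up the segment durations
        duration = sum(float(d) for d in re.findall(r'#EXTINF:\s*([\d.]+)', doc)) or None
    return {
        'live_status': None if duration else 'is_live',
        'duration': duration,
    }
```

The stream argument can be any object with a `read` method, for example `io.BytesIO(playlist_bytes)` when trying the function out.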
@ -2526,12 +2538,13 @@ def _real_extract(self, url):
return self.playlist_result(
self._parse_xspf(
doc, video_id, xspf_url=url,
xspf_base_url=full_response.url),
xspf_base_url=new_url),
video_id)
elif re.match(r'(?i)^(?:{[^}]+})?MPD$', doc.tag):
info_dict['formats'], info_dict['subtitles'] = self._parse_mpd_formats_and_subtitles(
doc,
mpd_base_url=full_response.url.rpartition('/')[0],
# Do not use yt_dlp.utils.base_url here since it will raise on file:// URLs
mpd_base_url=update_url(new_url, query=None, fragment=None).rpartition('/')[0],
mpd_url=url)
info_dict['live_status'] = 'is_live' if doc.get('type') == 'dynamic' else None
self._extra_manifest_info(info_dict, url)
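
Note on the hunk above: the new `mpd_base_url` expression drops the query and fragment before cutting at the last `/`. Without that step, a slash inside a query value would end up in the base URL. A stdlib sketch of the same idea, with `mpd_base_url` here being a hypothetical free function rather than the yt-dlp code path:

```python
from urllib.parse import urlsplit, urlunsplit


def mpd_base_url(manifest_url):
    """Return the directory part of a manifest URL, ignoring query and fragment."""
    parts = urlsplit(manifest_url)
    stripped = urlunsplit(parts._replace(query='', fragment=''))
    return stripped.rpartition('/')[0]


url = 'https://example.com/a/b/manifest.mpd?token=x/y#frag'
naive = url.rpartition('/')[0]   # cuts inside the query string
fixed = mpd_base_url(url)        # cuts at the last path separator
```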


@ -8,7 +8,7 @@
class GetCourseRuPlayerIE(InfoExtractor):
_VALID_URL = r'https?://player02\.getcourse\.ru/sign-player/?\?(?:[^#]+&)?json=[^#&]+'
_VALID_URL = r'https?://(?:player02\.getcourse\.ru|cf-api-2\.vhcdn\.com)/sign-player/?\?(?:[^#]+&)?json=[^#&]+'
_EMBED_REGEX = [rf'<iframe[^>]+\bsrc=[\'"](?P<url>{_VALID_URL}[^\'"]*)']
_TESTS = [{
'url': 'http://player02.getcourse.ru/sign-player/?json=eyJ2aWRlb19oYXNoIjoiMTkwYmRmOTNmMWIyOTczNTMwOTg1M2E3YTE5ZTI0YjMiLCJ1c2VyX2lkIjozNTk1MjUxODMsInN1Yl9sb2dpbl91c2VyX2lkIjpudWxsLCJsZXNzb25faWQiOm51bGwsImlwIjoiNDYuMTQyLjE4Mi4yNDciLCJnY19ob3N0IjoiYWNhZGVteW1lbC5vbmxpbmUiLCJ0aW1lIjoxNzA1NDQ5NjQyLCJwYXlsb2FkIjoidV8zNTk1MjUxODMiLCJ1aV9sYW5ndWFnZSI6InJ1IiwiaXNfaGF2ZV9jdXN0b21fc3R5bGUiOnRydWV9&s=354ad2c993d95d5ac629e3133d6cefea&vh-static-feature=zigzag',
@ -20,6 +20,16 @@ class GetCourseRuPlayerIE(InfoExtractor):
'duration': 1693,
},
'skip': 'JWT expired',
}, {
'url': 'https://cf-api-2.vhcdn.com/sign-player/?json=example',
'info_dict': {
'id': '435735291',
'title': '8afd7c489952108e00f019590f3711f3',
'ext': 'mp4',
'thumbnail': 'https://preview-htz.vhcdn.com/preview/8afd7c489952108e00f019590f3711f3/preview.jpg?version=1682170973&host=vh-72',
'duration': 777,
},
'skip': 'JWT expired',
}]
def _real_extract(self, url):
@ -168,7 +178,7 @@ def _real_extract(self, url):
playlist_id = self._search_regex(
r'window\.(?:lessonId|gcsObjectId)\s*=\s*(\d+)', webpage, 'playlist id', default=display_id)
title = self._og_search_title(webpage) or self._html_extract_title(webpage)
title = self._og_search_title(webpage, default=None) or self._html_extract_title(webpage)
return self.playlist_from_matches(
re.findall(GetCourseRuPlayerIE._EMBED_REGEX[0], webpage),


@ -7,161 +7,157 @@
int_or_none,
join_nonempty,
parse_age_limit,
remove_end,
remove_start,
traverse_obj,
try_get,
unified_timestamp,
urlencode_postdata,
)
from ..utils.traversal import traverse_obj
class GoIE(AdobePassIE):
_SITE_INFO = {
'abc': {
'brand': '001',
'requestor_id': 'ABC',
'requestor_id': 'dtci',
'provider_id': 'ABC',
'software_statement': 'eyJhbGciOiJSUzI1NiJ9.eyJzdWIiOiI4OTcwMjlkYS0yYjM1LTQyOWUtYWQ0NS02ZjZiZjVkZTdhOTUiLCJuYmYiOjE2MjAxNzM5NjksImlzcyI6ImF1dGguYWRvYmUuY29tIiwiaWF0IjoxNjIwMTczOTY5fQ.SC69DVJWSL8sIe-vVUrP6xS_kzHKqwz9PdKYexs_y-f7Vin6mM-7S-W1TE_-K55O0pyf-TL4xYgvm6LIye8CckG-nZfVwNPV4huduov0jmIcxCQFeUwkHULG2IaA44wfBVUBdaHgkhPweZ2amjycO_IXtez-gBXOLbE3B7Gx9j_5ISCFtyVUblThKfoGyQv6KT6t8Vpmc4ZSKCCQp74KWFFypydb9ucego1taW_nQD06Cdf4yByLd6NaTBceMcIKbug9b9gxFm3XBgJ5q3z7KGo1Kr6XalAV5j4m-fQ91wczlTilX8FM4AljMupyRM9mA_aEADILQ4hS79q4SM0w6w',
},
'freeform': {
'brand': '002',
'requestor_id': 'ABCFamily',
},
'watchdisneychannel': {
'brand': '004',
'resource_id': 'Disney',
},
'watchdisneyjunior': {
'brand': '008',
'resource_id': 'DisneyJunior',
},
'watchdisneyxd': {
'brand': '009',
'resource_id': 'DisneyXD',
'provider_id': 'ABCFamily',
'software_statement': 'eyJhbGciOiJSUzI1NiJ9.eyJzdWIiOiJhZWM2MGYyNC0xYzRjLTQ1NzQtYjc0Zi03ZmM4N2E5YWMzMzgiLCJuYmYiOjE1ODc2NjU5MjMsImlzcyI6ImF1dGguYWRvYmUuY29tIiwiaWF0IjoxNTg3NjY1OTIzfQ.flCn3dhvmvPnWmV0JV8Fm0YFyj07yPez9-n1GFEwVIm_S2wQVWbWyJhqsAyLZVFrhOMZYTqmPS3OHxGwTwXkEYn6PD7o_vIVG3oqi-Xn1m5jRt_Gazw5qEtpat6VE7bvKGSD3ZhcidOrsCk8NcYyq75u61NHDvSl81pcedJjVRVUpsqrEwmo0aVbA0C8PX3ri0mEbGvkMKvHn8E60xp-PSE-VK8SDT0plwPu_TwUszkZ6-_I8_2xcv_WBqcXFkAVg7Q-iNJXgQvmNsrpcrYuLvi6hEH4ZLtoDcXU6MhwTQAJTiHSo8x9aHX1_qFP09CzlNOFQbC2ZEJdP9SvA53SLQ',
},
'disneynow': {
'brand': '011',
'brand': '011', # also: '004', '008', '009'
'requestor_id': 'DisneyChannels',
'provider_id': 'DisneyChannels',
'software_statement': 'eyJhbGciOiJSUzI1NiJ9.eyJzdWIiOiI1MzAzNTRiOS04NDNiLTRkNjAtYTQ3ZS0yNzk1MzlkOTIyNTciLCJuYmYiOjE1NTg5ODc0NDksImlzcyI6ImF1dGguYWRvYmUuY29tIiwiaWF0IjoxNTU4OTg3NDQ5fQ.Jud6YS6-J2h0h6po0oMheDym0qRTJQGj4kzacrz4DFuEwhcBkkykW6pF5pKuAUJy9HCZ40oDAHe2KcTlDJjCZF5tDaUEfdihakZ9cC_rG7MU-QoRne8qaB_dPDKwGuk-ZyWD8eV3zwTJmbGo8hDxYTEU81YNCxwhyc_BPDr5TYiubbmpP3_pTnXmSpuL58isJ2peSKWlX9BacuXtBY25c_QnPFKk-_EETm7IHkTpDazde1QfHWGu4s4yJpKGk8RVVujVG6h6ELlL-ZeYLilBm7iS7h1TYG1u7fJhyZRL7isaom6NvAzsvN3ngss1fLwt8decP8wzdFHrbYTdTjW8qw',
'resource_id': 'Disney',
},
'fxnow.fxnetworks': {
'brand': '025',
'fxnetworks': {
'brand': '025', # also: '020'
'requestor_id': 'dtci',
'provider_id': 'fx', # also 'fxx', 'fxm'
'software_statement': 'eyJhbGciOiJSUzI1NiJ9.eyJzdWIiOiIzYWRhYWZiNC02OTAxLTRlYzktOTdmNy1lYWZkZTJkODJkN2EiLCJuYmYiOjE1NjIwMjQwNzYsImlzcyI6ImF1dGguYWRvYmUuY29tIiwiaWF0IjoxNTYyMDI0MDc2fQ.dhKMpZK50AObbZYrMiYPSfWtzXHUaeMP3jrIY4Cgfvh0GaEgk0Mns_zp78jypFeZgRtPVleQMQDNq2YEloRLcAGqP1aa6WVDglnK77ZWUm4IKai14Rwf3A6YBhSRoO2_lMmUGkuTf6gZY-kMIPqBYKqzTQiQl4HbniPFodIzFRiuI9QJVrkoyTGrJL4oqiX08PoFI3Z-TOti1Heu3EbFC-GveQHhlinYrzU7rbiAqLEz7FImtfBDsnXX1Y3uJDLYM3Bq4Oh0nrzTv1Fd62wNsCNErHHIbELidh1zZF0ujvt7ReuZUwAitm0UhEJ7OxNOUbEQWtae6pVNscvdvTFMpg',
},
'nationalgeographic': {
'brand': '026', # also '023'
'requestor_id': 'dtci',
'provider_id': 'ngc', # also 'ngw'
'software_statement': 'eyJhbGciOiJSUzI1NiJ9.eyJzdWIiOiIxMzE4YTM1Ni05Mjc4LTQ4NjEtYTFmNi1jMTIzMzg1ZWMzYzMiLCJuYmYiOjE1NjIwMjM4MjgsImlzcyI6ImF1dGguYWRvYmUuY29tIiwiaWF0IjoxNTYyMDIzODI4fQ.Le-2OzF9-jrhJ7ZfWtLWk5iSHGVZoxeU1w0_fO--Heli0OwRZsRq2slSmx-oZTzxuWmAgDEiBkWSDcDK6sM25DrCLsdsJa3MBuZ-slBRtH8aq3HpNoqqLkU-vg6gRUEKMtwBUtwCu_9aKUCayYtndWv4b1DjVQeSrteOW5NNudWVYleAe0kxeNJQHo5If9SCzDudKVJktFUjhNks4QPOC_uONPkRRlL9D0fNvtOY-LRFckfcHhf5z9l1iZjeukV0YhdKnuw1wyiaWrQXBUDiBfbkCRd2DM-KnelqPxfiXCaTjGKDURRBO3pz33ebge3IFXSiU5vl4qHQ8xvunzGpFw',
},
}
_VALID_URL = r'''(?x)
https?://
(?P<sub_domain>
(?:{}\.)?go|fxnow\.fxnetworks|
(?:www\.)?(?:abc|freeform|disneynow)
)\.com/
(?:
(?:[^/]+/)*(?P<id>[Vv][Dd][Kk][Aa]\w+)|
(?:[^/]+/)*(?P<display_id>[^/?\#]+)
)
'''.format(r'\.|'.join(list(_SITE_INFO.keys())))
_URL_PATH_RE = r'(?:video|episode|movies-and-specials)/(?P<id>[\da-f]{8}-(?:[\da-f]{4}-){3}[\da-f]{12})'
_VALID_URL = [
fr'https?://(?:www\.)?(?P<site>abc)\.com/{_URL_PATH_RE}',
fr'https?://(?:www\.)?(?P<site>freeform)\.com/{_URL_PATH_RE}',
fr'https?://(?:www\.)?(?P<site>disneynow)\.com/{_URL_PATH_RE}',
fr'https?://fxnow\.(?P<site>fxnetworks)\.com/{_URL_PATH_RE}',
fr'https?://(?:www\.)?(?P<site>nationalgeographic)\.com/tv/{_URL_PATH_RE}',
]
_TESTS = [{
'url': 'http://abc.go.com/shows/designated-survivor/video/most-recent/VDKA3807643',
'url': 'https://abc.com/episode/4192c0e6-26e5-47a8-817b-ce8272b9e440/playlist/PL551127435',
'info_dict': {
'id': 'VDKA3807643',
'id': 'VDKA10805898',
'ext': 'mp4',
'title': 'The Traitor in the White House',
'description': 'md5:05b009d2d145a1e85d25111bd37222e8',
},
'params': {
# m3u8 download
'skip_download': True,
},
'skip': 'This content is no longer available.',
}, {
'url': 'https://disneynow.com/shows/big-hero-6-the-series',
'info_dict': {
'title': 'Doraemon',
'id': 'SH55574025',
},
'playlist_mincount': 51,
}, {
'url': 'http://freeform.go.com/shows/shadowhunters/episodes/season-2/1-this-guilty-blood',
'info_dict': {
'id': 'VDKA3609139',
'title': 'This Guilty Blood',
'description': 'md5:f18e79ad1c613798d95fdabfe96cd292',
'title': 'Switch the Flip',
'description': 'To help get Brians life in order, Stewie and Brian swap bodies using a machine that Stewie invents.',
'age_limit': 14,
'duration': 1297,
'thumbnail': r're:https?://.+/.+\.jpg',
'series': 'Family Guy',
'season': 'Season 16',
'season_number': 16,
'episode': 'Episode 17',
'episode_number': 17,
'timestamp': 1746082800.0,
'upload_date': '20250501',
},
'params': {'skip_download': 'm3u8'},
'skip': 'This video requires AdobePass MSO credentials',
}, {
'url': 'https://disneynow.com/episode/21029660-ba06-4406-adb0-a9a78f6e265e/playlist/PL553044961',
'info_dict': {
'id': 'VDKA39546942',
'ext': 'mp4',
'title': 'Zero Friends Again',
'description': 'Relationships fray under the pressures of a difficult journey.',
'age_limit': 0,
'duration': 1721,
'thumbnail': r're:https?://.+/.+\.jpg',
'series': 'Star Wars: Skeleton Crew',
'season': 'Season 1',
'season_number': 1,
'episode': 'Episode 6',
'episode_number': 6,
'timestamp': 1746946800.0,
'upload_date': '20250511',
},
'params': {'skip_download': 'm3u8'},
'skip': 'This video requires AdobePass MSO credentials',
}, {
'url': 'https://fxnow.fxnetworks.com/episode/09f4fa6f-c293-469e-aebe-32c9ca5842a7/playlist/PL554408064',
'info_dict': {
'id': 'VDKA38112033',
'ext': 'mp4',
'title': 'The Return of Jerry',
'description': 'The vampires long-lost fifth roommate returns. Written by Paul Simms; directed by Kyle Newacheck.',
'age_limit': 17,
'duration': 1493,
'thumbnail': r're:https?://.+/.+\.jpg',
'series': 'What We Do in the Shadows',
'season': 'Season 6',
'season_number': 6,
'episode': 'Episode 1',
'upload_date': '20170102',
'season': 'Season 2',
'thumbnail': 'http://cdn1.edgedatg.com/aws/v2/abcf/Shadowhunters/video/201/ae5f75608d86bf88aa4f9f4aa76ab1b7/579x325-Q100_ae5f75608d86bf88aa4f9f4aa76ab1b7.jpg',
'duration': 2544,
'season_number': 2,
'series': 'Shadowhunters',
'episode_number': 1,
'timestamp': 1483387200,
'ext': 'mp4',
},
'params': {
'geo_bypass_ip_block': '3.244.239.0/24',
# m3u8 download
'skip_download': True,
'timestamp': 1729573200.0,
'upload_date': '20241022',
},
'params': {'skip_download': 'm3u8'},
'skip': 'This video requires AdobePass MSO credentials',
}, {
'url': 'https://abc.com/shows/the-rookie/episode-guide/season-04/12-the-knock',
'url': 'https://www.freeform.com/episode/bda0eaf7-761a-4838-aa44-96f794000844/playlist/PL553044961',
'info_dict': {
'id': 'VDKA26050359',
'title': 'The Knock',
'description': 'md5:0c2947e3ada4c31f28296db7db14aa64',
'age_limit': 14,
'id': 'VDKA39007340',
'ext': 'mp4',
'thumbnail': 'http://cdn1.edgedatg.com/aws/v2/abc/TheRookie/video/412/daf830d06e83b11eaf5c0a299d993ae3/1556x876-Q75_daf830d06e83b11eaf5c0a299d993ae3.jpg',
'episode': 'Episode 12',
'season_number': 4,
'season': 'Season 4',
'timestamp': 1642975200,
'episode_number': 12,
'upload_date': '20220123',
'series': 'The Rookie',
'duration': 2572,
},
'params': {
'geo_bypass_ip_block': '3.244.239.0/24',
# m3u8 download
'skip_download': True,
'title': 'Angel\'s Landing',
'description': 'md5:91bf084e785c968fab16734df7313446',
'age_limit': 14,
'duration': 2523,
'thumbnail': r're:https?://.+/.+\.jpg',
'series': 'How I Escaped My Cult',
'season': 'Season 1',
'season_number': 1,
'episode': 'Episode 2',
'episode_number': 2,
'timestamp': 1740038400.0,
'upload_date': '20250220',
},
'params': {'skip_download': 'm3u8'},
}, {
'url': 'https://fxnow.fxnetworks.com/shows/better-things/video/vdka12782841',
'url': 'https://www.nationalgeographic.com/tv/episode/ca694661-1186-41ae-8089-82f64d69b16d/playlist/PL554408064',
'info_dict': {
'id': 'VDKA12782841',
'title': 'First Look: Better Things - Season 2',
'description': 'md5:fa73584a95761c605d9d54904e35b407',
'id': 'VDKA39492078',
'ext': 'mp4',
'age_limit': 14,
'upload_date': '20170825',
'duration': 161,
'series': 'Better Things',
'thumbnail': 'http://cdn1.edgedatg.com/aws/v2/fx/BetterThings/video/12782841/b6b05e58264121cc2c98811318e6d507/1556x876-Q75_b6b05e58264121cc2c98811318e6d507.jpg',
'timestamp': 1503661074,
},
'params': {
'geo_bypass_ip_block': '3.244.239.0/24',
# m3u8 download
'skip_download': True,
'title': 'Heart of the Emperors',
'description': 'md5:4fc50a2878f030bb3a7eac9124dca677',
'age_limit': 0,
'duration': 2775,
'thumbnail': r're:https?://.+/.+\.jpg',
'series': 'Secrets of the Penguins',
'season': 'Season 1',
'season_number': 1,
'episode': 'Episode 1',
'episode_number': 1,
'timestamp': 1745204400.0,
'upload_date': '20250421',
},
'params': {'skip_download': 'm3u8'},
}, {
'url': 'http://abc.go.com/shows/the-catch/episode-guide/season-01/10-the-wedding',
'url': 'https://www.freeform.com/movies-and-specials/c38281fc-9f8f-47c7-8220-22394f9df2e1',
'only_matching': True,
}, {
'url': 'http://abc.go.com/shows/world-news-tonight/episode-guide/2017-02/17-021717-intense-stand-off-between-man-with-rifle-and-police-in-oakland',
'only_matching': True,
}, {
# brand 004
'url': 'http://disneynow.go.com/shows/big-hero-6-the-series/season-01/episode-10-mr-sparkles-loses-his-sparkle/vdka4637915',
'only_matching': True,
}, {
# brand 008
'url': 'http://disneynow.go.com/shows/minnies-bow-toons/video/happy-campers/vdka4872013',
'only_matching': True,
}, {
'url': 'https://disneynow.com/shows/minnies-bow-toons/video/happy-campers/vdka4872013',
'only_matching': True,
}, {
'url': 'https://www.freeform.com/shows/cruel-summer/episode-guide/season-01/01-happy-birthday-jeanette-turner',
'url': 'https://abc.com/video/219a454a-172c-41bf-878a-d169e6bc0bdc/playlist/PL5523098420',
'only_matching': True,
}]
@ -171,58 +167,29 @@ def _extract_videos(self, brand, video_id='-1', show_id='-1'):
f'http://api.contents.watchabc.go.com/vp2/ws/contents/3000/videos/{brand}/001/-1/{show_id}/-1/{video_id}/-1/-1.json',
display_id)['video']
def _extract_global_var(self, name, webpage, video_id):
return self._search_json(
fr'window\[["\']{re.escape(name)}["\']\]\s*=',
webpage, f'{name.strip("_")} JSON', video_id)
def _real_extract(self, url):
mobj = self._match_valid_url(url)
sub_domain = remove_start(remove_end(mobj.group('sub_domain') or '', '.go'), 'www.')
video_id, display_id = mobj.group('id', 'display_id')
site_info = self._SITE_INFO.get(sub_domain, {})
brand = site_info.get('brand')
if not video_id or not site_info:
webpage = self._download_webpage(url, display_id or video_id)
data = self._parse_json(
self._search_regex(
r'["\']__abc_com__["\']\s*\]\s*=\s*({.+?})\s*;', webpage,
'data', default='{}'),
display_id or video_id, fatal=False)
# https://abc.com/shows/modern-family/episode-guide/season-01/101-pilot
layout = try_get(data, lambda x: x['page']['content']['video']['layout'], dict)
video_id = None
if layout:
video_id = try_get(
layout,
(lambda x: x['videoid'], lambda x: x['video']['id']),
str)
site, display_id = self._match_valid_url(url).group('site', 'id')
webpage = self._download_webpage(url, display_id)
config = self._extract_global_var('__CONFIG__', webpage, display_id)
data = self._extract_global_var(config['globalVar'], webpage, display_id)
video_id = traverse_obj(data, (
'page', 'content', 'video', 'layout', (('video', 'id'), 'videoid'), {str}, any))
if not video_id:
video_id = self._search_regex(
(
# There may be inner quotes, e.g. data-video-id="'VDKA3609139'"
# from http://freeform.go.com/shows/shadowhunters/episodes/season-2/1-this-guilty-blood
r'data-video-id=["\']*(VDKA\w+)',
video_id = self._search_regex([
# data-track-video_id="VDKA39492078"
# data-track-video_id_code="vdka39492078"
# data-video-id="'VDKA3609139'"
r'data-(?:track-)?video[_-]id(?:_code)?=["\']*((?:vdka|VDKA)\d+)',
# page.analytics.videoIdCode
r'\bvideoIdCode["\']\s*:\s*["\']((?:vdka|VDKA)\w+)',
# https://abc.com/shows/the-rookie/episode-guide/season-02/03-the-bet
r'\b(?:video)?id["\']\s*:\s*["\'](VDKA\w+)',
), webpage, 'video id', default=video_id)
if not site_info:
brand = self._search_regex(
(r'data-brand=\s*["\']\s*(\d+)',
r'data-page-brand=\s*["\']\s*(\d+)'), webpage, 'brand',
default='004')
site_info = next(
si for _, si in self._SITE_INFO.items()
if si.get('brand') == brand)
if not video_id:
# show extraction works for Disney, DisneyJunior and DisneyXD
# ABC and Freeform has different layout
show_id = self._search_regex(r'data-show-id=["\']*(SH\d+)', webpage, 'show id')
videos = self._extract_videos(brand, show_id=show_id)
show_title = self._search_regex(r'data-show-title="([^"]+)"', webpage, 'show title', fatal=False)
entries = []
for video in videos:
entries.append(self.url_result(
video['url'], 'Go', video.get('id'), video.get('title')))
entries.reverse()
return self.playlist_result(entries, show_id, show_title)
r'\bvideoIdCode["\']\s*:\s*["\']((?:vdka|VDKA)\d+)'], webpage, 'video ID')
site_info = self._SITE_INFO[site]
brand = site_info['brand']
video_data = self._extract_videos(brand, video_id)[0]
video_id = video_data['id']
title = video_data['title']
@ -238,26 +205,31 @@ def _real_extract(self, url):
if ext == 'm3u8':
video_type = video_data.get('type')
data = {
'video_id': video_data['id'],
'video_id': video_id,
'video_type': video_type,
'brand': brand,
'device': '001',
'app_name': 'webplayer-abc',
}
if video_data.get('accesslevel') == '1':
requestor_id = site_info.get('requestor_id', 'DisneyChannels')
provider_id = site_info['provider_id']
software_statement = traverse_obj(data, ('app', 'config', (
('features', 'auth', 'softwareStatement'),
('tvAuth', 'SOFTWARE_STATEMENTS', 'PRODUCTION'),
), {str}, any)) or site_info['software_statement']
resource = site_info.get('resource_id') or self._get_mvpd_resource(
requestor_id, title, video_id, None)
provider_id, title, video_id, None)
auth = self._extract_mvpd_auth(
url, video_id, requestor_id, resource)
url, video_id, site_info['requestor_id'], resource, software_statement)
data.update({
'token': auth,
'token_type': 'ap',
'adobe_requestor_id': requestor_id,
'adobe_requestor_id': provider_id,
})
else:
self._initialize_geo_bypass({'countries': ['US']})
entitlement = self._download_json(
'https://api.entitlement.watchabc.go.com/vp2/ws-secure/entitlement/2020/authorize.json',
'https://prod.gatekeeper.us-abc.symphony.edgedatg.go.com/vp2/ws-secure/entitlement/2020/playmanifest_secure.json',
video_id, data=urlencode_postdata(data))
errors = entitlement.get('errors', {}).get('errors', [])
if errors:
@ -267,7 +239,7 @@ def _real_extract(self, url):
error['message'], countries=['US'])
error_message = ', '.join([error['message'] for error in errors])
raise ExtractorError(f'{self.IE_NAME} said: {error_message}', expected=True)
asset_url += '?' + entitlement['uplynkData']['sessionKey']
asset_url += '?' + entitlement['entitlement']['uplynkData']['sessionKey']
fmts, subs = self._extract_m3u8_formats_and_subtitles(
asset_url, video_id, 'mp4', m3u8_id=format_id or 'hls', fatal=False)
formats.extend(fmts)
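
The new video ID lookup in this hunk uses one regex to cover the three attribute spellings listed in its comments. A minimal standalone sketch, using only the standard `re` module, shows what the pattern accepts. The `find_video_id` helper and the sample markup are illustrative assumptions, not part of yt-dlp:

```python
import re

# Pattern copied from the hunk above; everything else here is illustrative.
VIDEO_ID_RE = r'data-(?:track-)?video[_-]id(?:_code)?=["\']*((?:vdka|VDKA)\d+)'


def find_video_id(html):
    """Return the first VDKA-style ID found in html, or None if there is none."""
    match = re.search(VIDEO_ID_RE, html)
    return match.group(1) if match else None


# Made-up snippets mirroring the three attribute forms named in the comments.
samples = [
    '<div data-track-video_id="VDKA39492078">',
    '<div data-track-video_id_code="vdka39492078">',
    '''<div data-video-id="'VDKA3609139'">''',
]
for sample in samples:
    print(find_video_id(sample))
```

The `["\']*` part is what tolerates the nested quotes in the third form, and the ID keeps whatever letter case the page used.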


@ -6,7 +6,7 @@
)
class HSEShowBaseInfoExtractor(InfoExtractor):
class HSEShowBaseIE(InfoExtractor):
_GEO_COUNTRIES = ['DE']
def _extract_redux_data(self, url, video_id):
@ -28,7 +28,7 @@ def _extract_formats_and_subtitles(self, sources, video_id):
return formats, subtitles
class HSEShowIE(HSEShowBaseInfoExtractor):
class HSEShowIE(HSEShowBaseIE):
_VALID_URL = r'https?://(?:www\.)?hse\.de/dpl/c/tv-shows/(?P<id>[0-9]+)'
_TESTS = [{
'url': 'https://www.hse.de/dpl/c/tv-shows/505350',
@ -64,7 +64,7 @@ def _real_extract(self, url):
}
class HSEProductIE(HSEShowBaseInfoExtractor):
class HSEProductIE(HSEShowBaseIE):
_VALID_URL = r'https?://(?:www\.)?hse\.de/dpl/p/product/(?P<id>[0-9]+)'
_TESTS = [{
'url': 'https://www.hse.de/dpl/p/product/408630',


@ -1,5 +1,13 @@
from .common import InfoExtractor
from ..utils import ExtractorError, str_or_none, traverse_obj, unified_strdate
from ..utils import (
ExtractorError,
int_or_none,
str_or_none,
traverse_obj,
unified_strdate,
url_or_none,
)
class IchinanaLiveIE(InfoExtractor):
@ -157,3 +165,51 @@ def _real_extract(self, url):
'description': view_data.get('caption'),
'upload_date': unified_strdate(str_or_none(view_data.get('createdAt'))),
}
class IchinanaLiveVODIE(InfoExtractor):
IE_NAME = '17live:vod'
_VALID_URL = r'https?://(?:www\.)?17\.live/ja/vod/[^/?#]+/(?P<id>[^/?#]+)'
_TESTS = [{
'url': 'https://17.live/ja/vod/27323042/2cf84520-e65e-4b22-891e-1d3a00b0f068',
'md5': '3299b930d7457b069639486998a89580',
'info_dict': {
'id': '2cf84520-e65e-4b22-891e-1d3a00b0f068',
'ext': 'mp4',
'title': 'md5:b5f8cbf497d54cc6a60eb3b480182f01',
'uploader': 'md5:29fb12122ab94b5a8495586e7c3085a5',
'uploader_id': '27323042',
'channel': '🌟オールナイトニッポン アーカイブ🌟',
'channel_id': '2b4f85f1-d61e-429d-a901-68d32bdd8645',
'like_count': int,
'view_count': int,
'thumbnail': r're:https?://.+/.+\.(?:jpe?g|png)',
'duration': 549,
'description': 'md5:116f326579700f00eaaf5581aae1192e',
'timestamp': 1741058645,
'upload_date': '20250304',
},
}, {
'url': 'https://17.live/ja/vod/27323042/0de11bac-9bea-40b8-9eab-0239a7d88079',
'only_matching': True,
}]
def _real_extract(self, url):
video_id = self._match_id(url)
json_data = self._download_json(f'https://wap-api.17app.co/api/v1/vods/{video_id}', video_id)
return traverse_obj(json_data, {
'id': ('vodID', {str}),
'title': ('title', {str}),
'formats': ('vodURL', {lambda x: self._extract_m3u8_formats(x, video_id)}),
'uploader': ('userInfo', 'displayName', {str}),
'uploader_id': ('userInfo', 'roomID', {int}, {str_or_none}),
'channel': ('userInfo', 'name', {str}),
'channel_id': ('userInfo', 'userID', {str}),
'like_count': ('likeCount', {int_or_none}),
'view_count': ('viewCount', {int_or_none}),
'thumbnail': ('imageURL', {url_or_none}),
'duration': ('duration', {int_or_none}),
'description': ('description', {str}),
'timestamp': ('createdAt', {int_or_none}),
})
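
The test for this extractor expects `upload_date` `20250304` alongside `timestamp` `1741058645`, although the extractor only returns `timestamp`. Assuming the date is derived from the timestamp in UTC, a quick standard-library check confirms that the two expected values agree. `upload_date_from_timestamp` is a hypothetical helper written for this check:

```python
from datetime import datetime, timezone


def upload_date_from_timestamp(timestamp):
    """Format a Unix timestamp as YYYYMMDD in UTC (hypothetical helper)."""
    return datetime.fromtimestamp(timestamp, tz=timezone.utc).strftime('%Y%m%d')


print(upload_date_from_timestamp(1741058645))  # 20250304
```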


@ -1,3 +1,4 @@
import json
import re
import time
@ -6,9 +7,7 @@
ExtractorError,
determine_ext,
js_to_json,
parse_qs,
traverse_obj,
urlencode_postdata,
)
@ -16,7 +15,6 @@ class IPrimaIE(InfoExtractor):
_VALID_URL = r'https?://(?!cnn)(?:[^/]+)\.iprima\.cz/(?:[^/]+/)*(?P<id>[^/?#&]+)'
_GEO_BYPASS = False
_NETRC_MACHINE = 'iprima'
_AUTH_ROOT = 'https://auth.iprima.cz'
access_token = None
_TESTS = [{
@ -86,48 +84,18 @@ def _perform_login(self, username, password):
if self.access_token:
return
login_page = self._download_webpage(
f'{self._AUTH_ROOT}/oauth2/login', None, note='Downloading login page',
errnote='Downloading login page failed')
login_form = self._hidden_inputs(login_page)
login_form.update({
'_email': username,
'_password': password})
profile_select_html, login_handle = self._download_webpage_handle(
f'{self._AUTH_ROOT}/oauth2/login', None, data=urlencode_postdata(login_form),
note='Logging in')
# a profile may need to be selected first, even when there is only a single one
if '/profile-select' in login_handle.url:
profile_id = self._search_regex(
r'data-identifier\s*=\s*["\']?(\w+)', profile_select_html, 'profile id')
login_handle = self._request_webpage(
f'{self._AUTH_ROOT}/user/profile-select-perform/{profile_id}', None,
query={'continueUrl': '/user/login?redirect_uri=/user/'}, note='Selecting profile')
code = traverse_obj(login_handle.url, ({parse_qs}, 'code', 0))
if not code:
raise ExtractorError('Login failed', expected=True)
token_request_data = {
'scope': 'openid+email+profile+phone+address+offline_access',
'client_id': 'prima_sso',
'grant_type': 'authorization_code',
'code': code,
'redirect_uri': f'{self._AUTH_ROOT}/sso/auth-check'}
token_data = self._download_json(
f'{self._AUTH_ROOT}/oauth2/token', None,
note='Downloading token', errnote='Downloading token failed',
data=urlencode_postdata(token_request_data))
'https://ucet.iprima.cz/api/session/create', None,
note='Logging in', errnote='Failed to log in',
data=json.dumps({
'email': username,
'password': password,
'deviceName': 'Windows Chrome',
}).encode(), headers={'content-type': 'application/json'})
self.access_token = token_data.get('access_token')
if self.access_token is None:
raise ExtractorError('Getting token failed', expected=True)
self.access_token = token_data['accessToken']['value']
if not self.access_token:
raise ExtractorError('Failed to fetch access token')
def _real_initialize(self):
if not self.access_token:

yt_dlp/extractor/ivoox.py Normal file

@ -0,0 +1,78 @@
from .common import InfoExtractor
from ..utils import int_or_none, parse_iso8601, url_or_none, urljoin
from ..utils.traversal import traverse_obj
class IvooxIE(InfoExtractor):
_VALID_URL = (
r'https?://(?:www\.)?ivoox\.com/(?:\w{2}/)?[^/?#]+_rf_(?P<id>[0-9]+)_1\.html',
r'https?://go\.ivoox\.com/rf/(?P<id>[0-9]+)',
)
_TESTS = [{
'url': 'https://www.ivoox.com/dex-08x30-rostros-del-mal-los-asesinos-en-audios-mp3_rf_143594959_1.html',
'md5': '993f712de5b7d552459fc66aa3726885',
'info_dict': {
'id': '143594959',
'ext': 'mp3',
'timestamp': 1742731200,
'channel': 'DIAS EXTRAÑOS con Santiago Camacho',
'title': 'DEx 08x30 Rostros del mal: Los asesinos en serie que aterrorizaron España',
'description': 'md5:eae8b4b9740d0216d3871390b056bb08',
'uploader': 'Santiago Camacho',
'thumbnail': 'https://static-1.ivoox.com/audios/c/d/5/2/cd52f46783fe735000c33a803dce2554_XXL.jpg',
'upload_date': '20250323',
'episode': 'DEx 08x30 Rostros del mal: Los asesinos en serie que aterrorizaron España',
'duration': 11837,
'tags': ['españa', 'asesinos en serie', 'arropiero', 'historia criminal', 'mataviejas'],
},
}, {
'url': 'https://go.ivoox.com/rf/143594959',
'only_matching': True,
}, {
'url': 'https://www.ivoox.com/en/campodelgas-28-03-2025-audios-mp3_rf_144036942_1.html',
'only_matching': True,
}]
def _real_extract(self, url):
media_id = self._match_id(url)
webpage = self._download_webpage(url, media_id, fatal=False)
data = self._search_nuxt_data(
webpage, media_id, fatal=False, traverse=('data', 0, 'data', 'audio'))
direct_download = self._download_json(
f'https://vcore-web.ivoox.com/v1/public/audios/{media_id}/download-url', media_id, fatal=False,
note='Fetching direct download link', headers={'Referer': url})
download_paths = {
*traverse_obj(direct_download, ('data', 'downloadUrl', {str}, filter, all)),
*traverse_obj(data, (('downloadUrl', 'mediaUrl'), {str}, filter)),
}
formats = []
for path in download_paths:
formats.append({
'url': urljoin('https://ivoox.com', path),
'http_headers': {'Referer': url},
})
return {
'id': media_id,
'formats': formats,
'uploader': self._html_search_regex(r'data-prm-author="([^"]+)"', webpage, 'author', default=None),
'timestamp': parse_iso8601(
self._html_search_regex(r'data-prm-pubdate="([^"]+)"', webpage, 'timestamp', default=None)),
'channel': self._html_search_regex(r'data-prm-podname="([^"]+)"', webpage, 'channel', default=None),
'title': self._html_search_regex(r'data-prm-title="([^"]+)"', webpage, 'title', default=None),
'thumbnail': self._og_search_thumbnail(webpage, default=None),
'description': self._og_search_description(webpage, default=None),
**self._search_json_ld(webpage, media_id, default={}),
**traverse_obj(data, {
'title': ('title', {str}),
'description': ('description', {str}),
'thumbnail': ('image', {url_or_none}),
'timestamp': ('uploadDate', {parse_iso8601(delimiter=' ')}),
'duration': ('duration', {int_or_none}),
'tags': ('tags', ..., 'name', {str}),
}),
}
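
The extractor above merges download paths from two sources into a set, so duplicates collapse, and then resolves each path against the site root. A rough sketch of that idea using the standard library's `urljoin` (which may differ in edge cases from yt-dlp's own `urljoin` utility) could look like this. The `collect_format_urls` helper, the sample paths, and the final sort are assumptions added for illustration and deterministic output:

```python
from urllib.parse import urljoin


def collect_format_urls(api_url, page_urls, base='https://ivoox.com'):
    """Merge candidate paths, drop empties and duplicates, resolve against base."""
    candidates = {api_url, *page_urls}
    # Sorting is not in the original code; it only makes the output stable.
    return sorted(urljoin(base, path) for path in candidates if path)


print(collect_format_urls(
    '/listen_mn_1.mp3',
    ['/listen_mn_1.mp3', 'https://cdn.example.com/a.mp3', None],
))
```

Relative paths are resolved against the base, while absolute URLs pass through unchanged.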


@ -2,10 +2,12 @@
import random
from .common import InfoExtractor
from ..networking import HEADRequest
from ..utils import (
clean_html,
int_or_none,
try_get,
urlhandle_detect_ext,
)
@ -27,7 +29,7 @@ class JamendoIE(InfoExtractor):
'ext': 'flac',
# 'title': 'Maya Filipič - Stories from Emona I',
'title': 'Stories from Emona I',
'artist': 'Maya Filipič',
'artists': ['Maya Filipič'],
'album': 'Between two worlds',
'track': 'Stories from Emona I',
'duration': 210,
@ -93,9 +95,15 @@ def _real_extract(self, url):
if not cover_url or cover_url in urls:
continue
urls.append(cover_url)
urlh = self._request_webpage(
HEADRequest(cover_url), track_id, 'Checking thumbnail extension',
errnote=False, fatal=False)
if not urlh:
continue
size = int_or_none(cover_id.lstrip('size'))
thumbnails.append({
'id': cover_id,
'ext': urlhandle_detect_ext(urlh, default='jpg'),
'url': cover_url,
'width': size,
'height': size,


@ -1,23 +1,33 @@
import functools
import itertools
import math
import re
from .common import InfoExtractor
from ..utils import (
InAdvancePagedList,
ISO639Utils,
OnDemandPagedList,
clean_html,
int_or_none,
js_to_json,
make_archive_id,
orderedSet,
smuggle_url,
unified_strdate,
unified_timestamp,
unsmuggle_url,
url_basename,
url_or_none,
urlencode_postdata,
urljoin,
variadic,
)
from ..utils.traversal import traverse_obj
class JioSaavnBaseIE(InfoExtractor):
_URL_BASE_RE = r'https?://(?:www\.)?(?:jio)?saavn\.com'
_API_URL = 'https://www.jiosaavn.com/api.php'
_VALID_BITRATES = {'16', '32', '64', '128', '320'}
@ -30,16 +40,20 @@ def requested_bitrates(self):
f'Valid bitrates are: {", ".join(sorted(self._VALID_BITRATES, key=int))}')
return requested_bitrates
def _extract_formats(self, song_data):
def _extract_formats(self, item_data):
# Show/episode JSON data has a slightly different structure than song JSON data
if media_url := traverse_obj(item_data, ('more_info', 'encrypted_media_url', {str})):
item_data.setdefault('encrypted_media_url', media_url)
for bitrate in self.requested_bitrates:
media_data = self._download_json(
self._API_URL, song_data['id'],
self._API_URL, item_data['id'],
f'Downloading format info for {bitrate}',
fatal=False, data=urlencode_postdata({
'__call': 'song.generateAuthToken',
'_format': 'json',
'bitrate': bitrate,
'url': song_data['encrypted_media_url'],
'url': item_data['encrypted_media_url'],
}))
if not traverse_obj(media_data, ('auth_url', {url_or_none})):
self.report_warning(f'Unable to extract format info for {bitrate}')
@ -53,24 +67,6 @@ def _extract_formats(self, song_data):
'vcodec': 'none',
}
def _extract_song(self, song_data, url=None):
info = traverse_obj(song_data, {
'id': ('id', {str}),
'title': ('song', {clean_html}),
'album': ('album', {clean_html}),
'thumbnail': ('image', {url_or_none}, {lambda x: re.sub(r'-\d+x\d+\.', '-500x500.', x)}),
'duration': ('duration', {int_or_none}),
'view_count': ('play_count', {int_or_none}),
'release_year': ('year', {int_or_none}),
'artists': ('primary_artists', {lambda x: x.split(', ') if x else None}),
'webpage_url': ('perma_url', {url_or_none}),
})
if webpage_url := info.get('webpage_url') or url:
info['display_id'] = url_basename(webpage_url)
info['_old_archive_ids'] = [make_archive_id(JioSaavnSongIE, info['display_id'])]
return info
def _call_api(self, type_, token, note='API', params={}):
return self._download_json(
self._API_URL, token, f'Downloading {note} JSON', f'Unable to download {note} JSON',
@ -84,19 +80,89 @@ def _call_api(self, type_, token, note='API', params={}):
**params,
})
def _yield_songs(self, playlist_data):
for song_data in traverse_obj(playlist_data, ('songs', lambda _, v: v['id'] and v['perma_url'])):
song_info = self._extract_song(song_data)
url = smuggle_url(song_info['webpage_url'], {
'id': song_data['id'],
'encrypted_media_url': song_data['encrypted_media_url'],
@staticmethod
def _extract_song(song_data, url=None):
info = traverse_obj(song_data, {
'id': ('id', {str}),
'title': (('song', 'title'), {clean_html}, any),
'album': ((None, 'more_info'), 'album', {clean_html}, any),
'duration': ((None, 'more_info'), 'duration', {int_or_none}, any),
'channel': ((None, 'more_info'), 'label', {str}, any),
'channel_id': ((None, 'more_info'), 'label_id', {str}, any),
'channel_url': ((None, 'more_info'), 'label_url', {urljoin('https://www.jiosaavn.com/')}, any),
'release_date': ((None, 'more_info'), 'release_date', {unified_strdate}, any),
'release_year': ('year', {int_or_none}),
'thumbnail': ('image', {url_or_none}, {lambda x: re.sub(r'-\d+x\d+\.', '-500x500.', x)}),
'view_count': ('play_count', {int_or_none}),
'language': ('language', {lambda x: ISO639Utils.short2long(x.casefold()) or 'und'}),
'webpage_url': ('perma_url', {url_or_none}),
'artists': ('more_info', 'artistMap', 'primary_artists', ..., 'name', {str}, filter, all),
})
yield self.url_result(url, JioSaavnSongIE, url_transparent=True, **song_info)
if webpage_url := info.get('webpage_url') or url:
info['display_id'] = url_basename(webpage_url)
info['_old_archive_ids'] = [make_archive_id(JioSaavnSongIE, info['display_id'])]
if primary_artists := traverse_obj(song_data, ('primary_artists', {lambda x: x.split(', ') if x else None})):
info['artists'].extend(primary_artists)
if featured_artists := traverse_obj(song_data, ('featured_artists', {str}, filter)):
info['artists'].extend(featured_artists.split(', '))
info['artists'] = orderedSet(info['artists']) or None
return info
@staticmethod
def _extract_episode(episode_data, url=None):
info = JioSaavnBaseIE._extract_song(episode_data, url)
info.pop('_old_archive_ids', None)
info.update(traverse_obj(episode_data, {
'description': ('more_info', 'description', {str}),
'timestamp': ('more_info', 'release_time', {unified_timestamp}),
'series': ('more_info', 'show_title', {str}),
'series_id': ('more_info', 'show_id', {str}),
'season': ('more_info', 'season_title', {str}),
'season_number': ('more_info', 'season_no', {int_or_none}),
'season_id': ('more_info', 'season_id', {str}),
'episode_number': ('more_info', 'episode_number', {int_or_none}),
'cast': ('starring', {lambda x: x.split(', ') if x else None}),
}))
return info
def _extract_jiosaavn_result(self, url, endpoint, response_key, parse_func):
url, smuggled_data = unsmuggle_url(url)
data = traverse_obj(smuggled_data, ({
'id': ('id', {str}),
'encrypted_media_url': ('encrypted_media_url', {str}),
}))
if 'id' in data and 'encrypted_media_url' in data:
result = {'id': data['id']}
else:
# only extract metadata if this is not a url_transparent result
data = self._call_api(endpoint, self._match_id(url))[response_key][0]
result = parse_func(data, url)
result['formats'] = list(self._extract_formats(data))
return result
def _yield_items(self, playlist_data, keys=None, parse_func=None):
"""Subclasses using this method must set _ENTRY_IE"""
if parse_func is None:
parse_func = self._extract_song
for item_data in traverse_obj(playlist_data, (
*variadic(keys, (str, bytes, dict, set)), lambda _, v: v['id'] and v['perma_url'],
)):
info = parse_func(item_data)
url = smuggle_url(info['webpage_url'], traverse_obj(item_data, {
'id': ('id', {str}),
'encrypted_media_url': ((None, 'more_info'), 'encrypted_media_url', {str}, any),
}))
yield self.url_result(url, self._ENTRY_IE, url_transparent=True, **info)
class JioSaavnSongIE(JioSaavnBaseIE):
IE_NAME = 'jiosaavn:song'
_VALID_URL = r'https?://(?:www\.)?(?:jiosaavn\.com/song/[^/?#]+/|saavn\.com/s/song/(?:[^/?#]+/){3})(?P<id>[^/?#]+)'
_VALID_URL = JioSaavnBaseIE._URL_BASE_RE + r'(?:/song/[^/?#]+/|/s/song/(?:[^/?#]+/){3})(?P<id>[^/?#]+)'
_TESTS = [{
'url': 'https://www.jiosaavn.com/song/leja-re/OQsEfQFVUXk',
'md5': '3b84396d15ed9e083c3106f1fa589c04',
@ -106,12 +172,38 @@ class JioSaavnSongIE(JioSaavnBaseIE):
'ext': 'm4a',
'title': 'Leja Re',
'album': 'Leja Re',
'thumbnail': r're:https?://c.saavncdn.com/258/Leja-Re-Hindi-2018-20181124024539-500x500.jpg',
'thumbnail': r're:https?://.+/.+\.jpg',
'duration': 205,
'view_count': int,
'release_year': 2018,
'artists': ['Sandesh Shandilya', 'Dhvani Bhanushali', 'Tanishk Bagchi'],
'_old_archive_ids': ['jiosaavnsong OQsEfQFVUXk'],
'channel': 'T-Series',
'language': 'hin',
'channel_id': '34297',
'channel_url': 'https://www.jiosaavn.com/label/t-series-albums/6DLuXO3VoTo_',
'release_date': '20181124',
},
}, {
'url': 'https://www.jiosaavn.com/song/chuttamalle/P1FfWjZkQ0Q',
'md5': '96296c58d6ce488a417ef0728fd2d680',
'info_dict': {
'id': 'O94kBTtw',
'display_id': 'P1FfWjZkQ0Q',
'ext': 'm4a',
'title': 'Chuttamalle',
'album': 'Devara Part 1 - Telugu',
'thumbnail': r're:https?://.+/.+\.jpg',
'duration': 222,
'view_count': int,
'release_year': 2024,
'artists': 'count:3',
'_old_archive_ids': ['jiosaavnsong P1FfWjZkQ0Q'],
'channel': 'T-Series',
'language': 'tel',
'channel_id': '34297',
'channel_url': 'https://www.jiosaavn.com/label/t-series-albums/6DLuXO3VoTo_',
'release_date': '20240926',
},
}, {
'url': 'https://www.saavn.com/s/song/hindi/Saathiya/O-Humdum-Suniyo-Re/KAMiazoCblU',
@ -119,26 +211,51 @@ class JioSaavnSongIE(JioSaavnBaseIE):
}]
def _real_extract(self, url):
url, smuggled_data = unsmuggle_url(url)
song_data = traverse_obj(smuggled_data, ({
'id': ('id', {str}),
'encrypted_media_url': ('encrypted_media_url', {str}),
}))
return self._extract_jiosaavn_result(url, 'song', 'songs', self._extract_song)
if 'id' in song_data and 'encrypted_media_url' in song_data:
result = {'id': song_data['id']}
else:
# only extract metadata if this is not a url_transparent result
song_data = self._call_api('song', self._match_id(url))['songs'][0]
result = self._extract_song(song_data, url)
result['formats'] = list(self._extract_formats(song_data))
return result
class JioSaavnShowIE(JioSaavnBaseIE):
IE_NAME = 'jiosaavn:show'
_VALID_URL = JioSaavnBaseIE._URL_BASE_RE + r'/shows/[^/?#]+/(?P<id>[^/?#]{11,})/?(?:$|[?#])'
_TESTS = [{
'url': 'https://www.jiosaavn.com/shows/non-food-ways-to-boost-your-energy/XFMcKICOCgc_',
'md5': '0733cd254cfe74ef88bea1eaedcf1f4f',
'info_dict': {
'id': 'qqzh3RKZ',
'display_id': 'XFMcKICOCgc_',
'ext': 'mp3',
'title': 'Non-Food Ways To Boost Your Energy',
'description': 'md5:26e7129644b5c6aada32b8851c3997c8',
'episode': 'Episode 1',
'timestamp': 1640563200,
'series': 'Holistic Lifestyle With Neha Ranglani',
'series_id': '52397',
'season': 'Holistic Lifestyle With Neha Ranglani',
'season_number': 1,
'season_id': '61273',
'thumbnail': r're:https?://.+/.+\.jpg',
'duration': 311,
'view_count': int,
'release_year': 2021,
'language': 'eng',
'channel': 'Saavn OG',
'channel_id': '1953876',
'episode_number': 1,
'upload_date': '20211227',
'release_date': '20211227',
},
}, {
'url': 'https://www.jiosaavn.com/shows/himesh-reshammiya/Kr8fmfSN4vo_',
'only_matching': True,
}]
def _real_extract(self, url):
return self._extract_jiosaavn_result(url, 'episode', 'episodes', self._extract_episode)
class JioSaavnAlbumIE(JioSaavnBaseIE):
IE_NAME = 'jiosaavn:album'
_VALID_URL = r'https?://(?:www\.)?(?:jio)?saavn\.com/album/[^/?#]+/(?P<id>[^/?#]+)'
_VALID_URL = JioSaavnBaseIE._URL_BASE_RE + r'/album/[^/?#]+/(?P<id>[^/?#]+)'
_TESTS = [{
'url': 'https://www.jiosaavn.com/album/96/buIOjYZDrNA_',
'info_dict': {
@ -147,18 +264,19 @@ class JioSaavnAlbumIE(JioSaavnBaseIE):
},
'playlist_count': 10,
}]
_ENTRY_IE = JioSaavnSongIE
def _real_extract(self, url):
display_id = self._match_id(url)
album_data = self._call_api('album', display_id)
return self.playlist_result(
self._yield_songs(album_data), display_id, traverse_obj(album_data, ('title', {str})))
self._yield_items(album_data, 'songs'), display_id, traverse_obj(album_data, ('title', {str})))
class JioSaavnPlaylistIE(JioSaavnBaseIE):
IE_NAME = 'jiosaavn:playlist'
_VALID_URL = r'https?://(?:www\.)?(?:jio)?saavn\.com/(?:s/playlist/(?:[^/?#]+/){2}|featured/[^/?#]+/)(?P<id>[^/?#]+)'
_VALID_URL = JioSaavnBaseIE._URL_BASE_RE + r'/(?:s/playlist/(?:[^/?#]+/){2}|featured/[^/?#]+/)(?P<id>[^/?#]+)'
_TESTS = [{
'url': 'https://www.jiosaavn.com/s/playlist/2279fbe391defa793ad7076929a2f5c9/mood-english/LlJ8ZWT1ibN5084vKHRj2Q__',
'info_dict': {
@ -172,15 +290,16 @@ class JioSaavnPlaylistIE(JioSaavnBaseIE):
'id': 'DVR,pFUOwyXqIp77B1JF,A__',
'title': 'Mood Hindi',
},
'playlist_mincount': 801,
'playlist_mincount': 750,
}, {
'url': 'https://www.jiosaavn.com/featured/taaza-tunes/Me5RridRfDk_',
'info_dict': {
'id': 'Me5RridRfDk_',
'title': 'Taaza Tunes',
},
'playlist_mincount': 301,
'playlist_mincount': 50,
}]
_ENTRY_IE = JioSaavnSongIE
_PAGE_SIZE = 50
def _fetch_page(self, token, page):
@ -189,7 +308,7 @@ def _fetch_page(self, token, page):
def _entries(self, token, first_page_data, page):
page_data = first_page_data if not page else self._fetch_page(token, page + 1)
yield from self._yield_songs(page_data)
yield from self._yield_items(page_data, 'songs')
def _real_extract(self, url):
display_id = self._match_id(url)
@ -199,3 +318,95 @@ def _real_extract(self, url):
return self.playlist_result(InAdvancePagedList(
functools.partial(self._entries, display_id, playlist_data),
total_pages, self._PAGE_SIZE), display_id, traverse_obj(playlist_data, ('listname', {str})))
class JioSaavnShowPlaylistIE(JioSaavnBaseIE):
IE_NAME = 'jiosaavn:show:playlist'
_VALID_URL = JioSaavnBaseIE._URL_BASE_RE + r'/shows/(?P<show>[^#/?]+)/(?P<season>\d+)/[^/?#]+'
_TESTS = [{
'url': 'https://www.jiosaavn.com/shows/talking-music/1/PjReFP-Sguk_',
'info_dict': {
'id': 'talking-music-1',
'title': 'Talking Music',
},
'playlist_mincount': 11,
}]
_ENTRY_IE = JioSaavnShowIE
_PAGE_SIZE = 10
def _fetch_page(self, show_id, season_id, page):
return self._call_api('show', show_id, f'show page {page}', {
'p': page,
'__call': 'show.getAllEpisodes',
'show_id': show_id,
'season_number': season_id,
'api_version': '4',
'sort_order': 'desc',
})
def _entries(self, show_id, season_id, page):
page_data = self._fetch_page(show_id, season_id, page + 1)
yield from self._yield_items(page_data, keys=None, parse_func=self._extract_episode)
def _real_extract(self, url):
show_slug, season_id = self._match_valid_url(url).group('show', 'season')
playlist_id = f'{show_slug}-{season_id}'
webpage = self._download_webpage(url, playlist_id)
show_info = self._search_json(
r'window\.__INITIAL_DATA__\s*=', webpage, 'initial data',
playlist_id, transform_source=js_to_json)['showView']
show_id = show_info['current_id']
entries = OnDemandPagedList(functools.partial(self._entries, show_id, season_id), self._PAGE_SIZE)
return self.playlist_result(
entries, playlist_id, traverse_obj(show_info, ('show', 'title', 'text', {str})))
class JioSaavnArtistIE(JioSaavnBaseIE):
IE_NAME = 'jiosaavn:artist'
_VALID_URL = JioSaavnBaseIE._URL_BASE_RE + r'/artist/[^/?#]+/(?P<id>[^/?#]+)'
_TESTS = [{
'url': 'https://www.jiosaavn.com/artist/krsna-songs/rYLBEve2z3U_',
'info_dict': {
'id': 'rYLBEve2z3U_',
'title': 'KR$NA',
},
'playlist_mincount': 38,
}, {
'url': 'https://www.jiosaavn.com/artist/sanam-puri-songs/SkNEv3qRhDE_',
'info_dict': {
'id': 'SkNEv3qRhDE_',
'title': 'Sanam Puri',
},
'playlist_mincount': 51,
}]
_ENTRY_IE = JioSaavnSongIE
_PAGE_SIZE = 50
def _fetch_page(self, artist_id, page):
return self._call_api('artist', artist_id, f'artist page {page + 1}', {
'p': page,
'n_song': self._PAGE_SIZE,
'n_album': self._PAGE_SIZE,
'sub_type': '',
'includeMetaTags': '',
'api_version': '4',
'category': 'alphabetical',
'sort_order': 'asc',
})
def _entries(self, artist_id, first_page):
for page in itertools.count():
playlist_data = first_page if not page else self._fetch_page(artist_id, page)
if not traverse_obj(playlist_data, ('topSongs', ..., {dict})):
break
yield from self._yield_items(playlist_data, 'topSongs')
def _real_extract(self, url):
artist_id = self._match_id(url)
first_page = self._fetch_page(artist_id, 0)
return self.playlist_result(
self._entries(artist_id, first_page), artist_id,
traverse_obj(first_page, ('name', {str})))


@ -1,3 +1,5 @@
import itertools
from .common import InfoExtractor
from ..utils import (
determine_ext,
@ -124,3 +126,43 @@ def _extract_formats(self, media_info, video_id):
'vbr': ('bitrateVideo', {int_or_none}, {lambda x: None if x == -1 else x}),
}),
}
class KikaPlaylistIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?kika\.de/[\w-]+/(?P<id>[a-z-]+\d+)'
_TESTS = [{
'url': 'https://www.kika.de/logo/logo-die-welt-und-ich-562',
'info_dict': {
'id': 'logo-die-welt-und-ich-562',
'title': 'logo!',
'description': 'md5:7b9d7f65561b82fa512f2cfb553c397d',
},
'playlist_count': 100,
}]
def _entries(self, playlist_url, playlist_id):
for page in itertools.count(1):
data = self._download_json(playlist_url, playlist_id, note=f'Downloading page {page}')
for item in traverse_obj(data, ('content', lambda _, v: url_or_none(v['api']['url']))):
yield self.url_result(
item['api']['url'], ie=KikaIE,
**traverse_obj(item, {
'id': ('id', {str}),
'title': ('title', {str}),
'duration': ('duration', {int_or_none}),
'timestamp': ('date', {parse_iso8601}),
}))
playlist_url = traverse_obj(data, ('links', 'next', {url_or_none}))
if not playlist_url:
break
def _real_extract(self, url):
playlist_id = self._match_id(url)
brand_data = self._download_json(
f'https://www.kika.de/_next-api/proxy/v1/brands/{playlist_id}', playlist_id)
return self.playlist_result(
self._entries(brand_data['videoSubchannel']['videosPageUrl'], playlist_id),
playlist_id, title=brand_data.get('title'), description=brand_data.get('description'))
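
The `_entries` generator above pages through results by following the `links.next` URL until it is missing. A minimal, self-contained sketch of that pattern follows. It assumes an in-memory stand-in for the API: `FAKE_PAGES`, `fetch` and `iter_entries` are hypothetical names used only for illustration and are not part of yt-dlp.

```python
# Hypothetical in-memory stand-in for the paged API.
# The keys ('content', 'links', 'next') mirror the ones read in the diff above.
FAKE_PAGES = {
    'page-1': {'content': [{'id': 'a'}, {'id': 'b'}], 'links': {'next': 'page-2'}},
    'page-2': {'content': [{'id': 'c'}], 'links': {}},
}


def fetch(url):
    """Stand-in for a JSON download; a real extractor would issue an HTTP request."""
    return FAKE_PAGES[url]


def iter_entries(first_url):
    """Yield items page by page, following links.next until it is absent."""
    url = first_url
    while url:
        data = fetch(url)
        yield from data.get('content') or []
        # A missing or empty 'next' link ends the loop
        url = (data.get('links') or {}).get('next')


entry_ids = [item['id'] for item in iter_entries('page-1')]
print(entry_ids)
```

Because the generator is lazy, a consumer that stops early never requests the remaining pages.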


@ -1,4 +1,5 @@
import itertools
import json
import re
from .common import InfoExtractor
@ -9,12 +10,12 @@
int_or_none,
mimetype2ext,
srt_subtitles_timecode,
traverse_obj,
try_get,
url_or_none,
urlencode_postdata,
urljoin,
)
from ..utils.traversal import find_elements, require, traverse_obj
class LinkedInBaseIE(InfoExtractor):
@ -82,7 +83,10 @@ def _get_video_id(self, video_data, course_slug, video_slug):
class LinkedInIE(LinkedInBaseIE):
_VALID_URL = r'https?://(?:www\.)?linkedin\.com/posts/[^/?#]+-(?P<id>\d+)-\w{4}/?(?:[?#]|$)'
_VALID_URL = [
r'https?://(?:www\.)?linkedin\.com/posts/[^/?#]+-(?P<id>\d+)-\w{4}/?(?:[?#]|$)',
r'https?://(?:www\.)?linkedin\.com/feed/update/urn:li:activity:(?P<id>\d+)',
]
_TESTS = [{
'url': 'https://www.linkedin.com/posts/mishalkhawaja_sendinblueviews-toronto-digitalmarketing-ugcPost-6850898786781339649-mM20',
'info_dict': {
@ -106,6 +110,9 @@ class LinkedInIE(LinkedInBaseIE):
'like_count': int,
'subtitles': 'mincount:1',
},
}, {
'url': 'https://www.linkedin.com/feed/update/urn:li:activity:7016901149999955968/?utm_source=share&utm_medium=member_desktop',
'only_matching': True,
}]
def _real_extract(self, url):
@ -271,3 +278,110 @@ def _real_extract(self, url):
entries, course_slug,
course_data.get('title'),
course_data.get('description'))
class LinkedInEventsIE(LinkedInBaseIE):
IE_NAME = 'linkedin:events'
_VALID_URL = r'https?://(?:www\.)?linkedin\.com/events/(?P<id>[\w-]+)'
_TESTS = [{
'url': 'https://www.linkedin.com/events/7084656651378536448/comments/',
'info_dict': {
'id': '7084656651378536448',
'ext': 'mp4',
'title': '#37 Aprende a hacer una entrevista en inglés para tu próximo trabajo remoto',
'description': '¡Agarra para anotar que se viene tremendo evento!',
'duration': 1765,
'timestamp': 1689113772,
'upload_date': '20230711',
'release_timestamp': 1689174012,
'release_date': '20230712',
'live_status': 'was_live',
},
}, {
'url': 'https://www.linkedin.com/events/27-02energyfreedombyenergyclub7295762520814874625/comments/',
'info_dict': {
'id': '27-02energyfreedombyenergyclub7295762520814874625',
'ext': 'mp4',
'title': '27.02 Energy Freedom by Energy Club',
'description': 'md5:1292e6f31df998914c293787a02c3b91',
'duration': 6420,
'timestamp': 1739445333,
'upload_date': '20250213',
'release_timestamp': 1740657620,
'release_date': '20250227',
'live_status': 'was_live',
},
}]
def _real_initialize(self):
if not self._get_cookies('https://www.linkedin.com/').get('li_at'):
self.raise_login_required()
def _real_extract(self, url):
event_id = self._match_id(url)
webpage = self._download_webpage(url, event_id)
base_data = traverse_obj(webpage, (
{find_elements(tag='code', attr='style', value='display: none')}, ..., {json.loads}, 'included', ...))
meta_data = traverse_obj(base_data, (
lambda _, v: v['$type'] == 'com.linkedin.voyager.dash.events.ProfessionalEvent', any)) or {}
live_status = {
'PAST': 'was_live',
'ONGOING': 'is_live',
'FUTURE': 'is_upcoming',
}.get(meta_data.get('lifecycleState'))
if live_status == 'is_upcoming':
player_data = {}
if event_time := traverse_obj(meta_data, ('displayEventTime', {str})):
message = f'This live event is scheduled for {event_time}'
else:
message = 'This live event has not yet started'
self.raise_no_formats(message, expected=True, video_id=event_id)
else:
# TODO: Add support for audio-only live events
player_data = traverse_obj(base_data, (
lambda _, v: v['$type'] == 'com.linkedin.videocontent.VideoPlayMetadata',
any, {require('video player data')}))
formats, subtitles = [], {}
for prog_fmts in traverse_obj(player_data, ('progressiveStreams', ..., {dict})):
for fmt_url in traverse_obj(prog_fmts, ('streamingLocations', ..., 'url', {url_or_none})):
formats.append({
'url': fmt_url,
**traverse_obj(prog_fmts, {
'width': ('width', {int_or_none}),
'height': ('height', {int_or_none}),
'tbr': ('bitRate', {int_or_none(scale=1000)}),
'filesize': ('size', {int_or_none}),
'ext': ('mediaType', {mimetype2ext}),
}),
})
for m3u8_url in traverse_obj(player_data, (
'adaptiveStreams', lambda _, v: v['protocol'] == 'HLS', 'masterPlaylists', ..., 'url', {url_or_none},
)):
fmts, subs = self._extract_m3u8_formats_and_subtitles(
m3u8_url, event_id, 'mp4', m3u8_id='hls', fatal=False)
formats.extend(fmts)
self._merge_subtitles(subs, target=subtitles)
return {
'id': event_id,
'formats': formats,
'subtitles': subtitles,
'live_status': live_status,
**traverse_obj(meta_data, {
'title': ('name', {str}),
'description': ('description', 'text', {str}),
'timestamp': ('createdAt', {int_or_none(scale=1000)}),
# timeRange.start is available when the stream is_upcoming
'release_timestamp': ('timeRange', 'start', {int_or_none(scale=1000)}),
}),
**traverse_obj(player_data, {
'duration': ('duration', {int_or_none(scale=1000)}),
# liveStreamCreatedAt is only available when the stream is_live or was_live
'release_timestamp': ('liveStreamCreatedAt', {int_or_none(scale=1000)}),
}),
}

159
yt_dlp/extractor/loco.py Normal file

@ -0,0 +1,159 @@
import json
import random
import time
from .common import InfoExtractor
from ..utils import int_or_none, jwt_decode_hs256, try_call, url_or_none
from ..utils.traversal import require, traverse_obj
class LocoIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?loco\.com/(?P<type>streamers|stream)/(?P<id>[^/?#]+)'
_TESTS = [{
'url': 'https://loco.com/streamers/teuzinfps',
'info_dict': {
'id': 'teuzinfps',
'ext': 'mp4',
'title': r're:MS BOLADAO, RESENHA & GAMEPLAY ALTO NIVEL',
'description': 'bom e novo',
'uploader_id': 'RLUVE3S9JU',
'channel': 'teuzinfps',
'channel_follower_count': int,
'comment_count': int,
'view_count': int,
'concurrent_view_count': int,
'like_count': int,
'thumbnail': 'https://static.ivory.getloconow.com/default_thumb/743701a9-98ca-41ae-9a8b-70bd5da070ad.jpg',
'tags': ['MMORPG', 'Gameplay'],
'series': 'Tibia',
'timestamp': int,
'modified_timestamp': int,
'live_status': 'is_live',
'upload_date': str,
'modified_date': str,
},
'params': {
'skip_download': 'Livestream',
},
}, {
'url': 'https://loco.com/stream/c64916eb-10fb-46a9-9a19-8c4b7ed064e7',
'md5': '45ebc8a47ee1c2240178757caf8881b5',
'info_dict': {
'id': 'c64916eb-10fb-46a9-9a19-8c4b7ed064e7',
'ext': 'mp4',
'title': 'PAULINHO LOKO NA LOCO!',
'description': 'live on na loco',
'uploader_id': '2MDO7Z1DPM',
'channel': 'paulinholokobr',
'channel_follower_count': int,
'comment_count': int,
'view_count': int,
'concurrent_view_count': int,
'like_count': int,
'duration': 14491,
'thumbnail': 'https://static.ivory.getloconow.com/default_thumb/59b5970b-23c1-4518-9e96-17ce341299fe.jpg',
'tags': ['Gameplay'],
'series': 'GTA 5',
'timestamp': 1740612872,
'modified_timestamp': 1740613037,
'upload_date': '20250226',
'modified_date': '20250226',
},
}, {
# Requires video authorization
'url': 'https://loco.com/stream/ac854641-ae0f-497c-a8ea-4195f6d8cc53',
'md5': '0513edf85c1e65c9521f555f665387d5',
'info_dict': {
'id': 'ac854641-ae0f-497c-a8ea-4195f6d8cc53',
'ext': 'mp4',
'title': 'DUAS CONTAS DESAFIANTE, RUSH TOP 1 NO BRASIL!',
'description': 'md5:aa77818edd6fe00dd4b6be75cba5f826',
'uploader_id': '7Y9JNAZC3Q',
'channel': 'ayellol',
'channel_follower_count': int,
'comment_count': int,
'view_count': int,
'concurrent_view_count': int,
'like_count': int,
'duration': 1229,
'thumbnail': 'https://static.ivory.getloconow.com/default_thumb/f5aa678b-6d04-45d9-a89a-859af0a8028f.jpg',
'tags': ['Gameplay', 'Carry'],
'series': 'League of Legends',
'timestamp': 1741182253,
'upload_date': '20250305',
'modified_timestamp': 1741182419,
'modified_date': '20250305',
},
}]
# From _app.js
_CLIENT_ID = 'TlwKp1zmF6eKFpcisn3FyR18WkhcPkZtzwPVEEC3'
_CLIENT_SECRET = 'Kp7tYlUN7LXvtcSpwYvIitgYcLparbtsQSe5AdyyCdiEJBP53Vt9J8eB4AsLdChIpcO2BM19RA3HsGtqDJFjWmwoonvMSG3ZQmnS8x1YIM8yl82xMXZGbE3NKiqmgBVU'
def _is_jwt_expired(self, token):
return jwt_decode_hs256(token)['exp'] - time.time() < 300
def _get_access_token(self, video_id):
access_token = try_call(lambda: self._get_cookies('https://loco.com')['access_token'].value)
if access_token and not self._is_jwt_expired(access_token):
return access_token
access_token = traverse_obj(self._download_json(
'https://api.getloconow.com/v3/user/device_profile/', video_id,
'Downloading access token', fatal=False, data=json.dumps({
'platform': 7,
'client_id': self._CLIENT_ID,
'client_secret': self._CLIENT_SECRET,
'model': 'Mozilla',
'os_name': 'Win32',
'os_ver': '5.0 (Windows)',
'app_ver': '5.0 (Windows)',
}).encode(), headers={
'Content-Type': 'application/json;charset=utf-8',
'DEVICE-ID': ''.join(random.choices('0123456789abcdef', k=32)) + 'live',
'X-APP-LANG': 'en',
'X-APP-LOCALE': 'en-US',
'X-CLIENT-ID': self._CLIENT_ID,
'X-CLIENT-SECRET': self._CLIENT_SECRET,
'X-PLATFORM': '7',
}), 'access_token')
if access_token and not self._is_jwt_expired(access_token):
self._set_cookie('.loco.com', 'access_token', access_token)
return access_token
def _real_extract(self, url):
video_type, video_id = self._match_valid_url(url).group('type', 'id')
webpage = self._download_webpage(url, video_id)
stream = traverse_obj(self._search_nextjs_data(webpage, video_id), (
'props', 'pageProps', ('liveStreamData', 'stream', 'liveStream'), {dict}, any, {require('stream info')}))
if access_token := self._get_access_token(video_id):
self._request_webpage(
'https://drm.loco.com/v1/streams/playback/', video_id,
'Downloading video authorization', fatal=False, headers={
'authorization': access_token,
}, query={
'stream_uid': stream['uid'],
})
return {
'formats': self._extract_m3u8_formats(stream['conf']['hls'], video_id),
'id': video_id,
'is_live': video_type == 'streamers',
**traverse_obj(stream, {
'title': ('title', {str}),
'series': ('game_name', {str}),
'uploader_id': ('user_uid', {str}),
'channel': ('alias', {str}),
'description': ('description', {str}),
'concurrent_view_count': ('viewersCurrent', {int_or_none}),
'view_count': ('total_views', {int_or_none}),
'thumbnail': ('thumbnail_url_small', {url_or_none}),
'like_count': ('likes', {int_or_none}),
'tags': ('tags', ..., {str}),
'timestamp': ('started_at', {int_or_none(scale=1000)}),
'modified_timestamp': ('updated_at', {int_or_none(scale=1000)}),
'comment_count': ('comments_count', {int_or_none}),
'channel_follower_count': ('followers_count', {int_or_none}),
'duration': ('duration', {int_or_none}),
}),
}
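
The `_is_jwt_expired` check above treats a token as expired once it has less than 300 seconds left. The sketch below shows the same idea using only the standard library, as an illustration rather than yt-dlp's implementation. `decode_jwt_payload`, `is_jwt_expired`, `make_unsigned_token` and the `now` parameter are hypothetical names introduced here, and the signature is deliberately not verified.

```python
import base64
import json
import time


def decode_jwt_payload(token):
    """Decode the payload segment of a JWT without verifying the signature."""
    payload_b64 = token.split('.')[1]
    # base64url segments usually omit padding; restore it before decoding
    payload_b64 += '=' * (-len(payload_b64) % 4)
    return json.loads(base64.urlsafe_b64decode(payload_b64))


def is_jwt_expired(token, margin=300, now=None):
    """Return True if the token has fewer than `margin` seconds left."""
    now = time.time() if now is None else now
    return decode_jwt_payload(token)['exp'] - now < margin


def make_unsigned_token(payload):
    """Hypothetical test helper: builds a JWT-shaped string with a dummy signature."""
    def b64(obj):
        return base64.urlsafe_b64encode(json.dumps(obj).encode()).rstrip(b'=').decode()

    header = b64({'alg': 'HS256', 'typ': 'JWT'})
    return f'{header}.{b64(payload)}.signature'


token = make_unsigned_token({'exp': 2000})
print(is_jwt_expired(token, now=1000))
print(is_jwt_expired(token, now=1800))
```

The safety margin avoids starting a request with a token that would expire before the request completes.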


@ -3,7 +3,9 @@
clean_html,
merge_dicts,
traverse_obj,
unified_timestamp,
url_or_none,
urljoin,
)
@ -80,7 +82,7 @@ class LRTVODIE(LRTBaseIE):
}]
def _real_extract(self, url):
path, video_id = self._match_valid_url(url).groups()
path, video_id = self._match_valid_url(url).group('path', 'id')
webpage = self._download_webpage(url, video_id)
media_url = self._extract_js_var(webpage, 'main_url', path)
@ -106,3 +108,44 @@ def _real_extract(self, url):
}
return merge_dicts(clean_info, jw_data, json_ld_data)
class LRTRadioIE(LRTBaseIE):
_VALID_URL = r'https?://(?:www\.)?lrt\.lt/radioteka/irasas/(?P<id>\d+)/(?P<path>[^?#/]+)'
_TESTS = [{
# m3u8 download
'url': 'https://www.lrt.lt/radioteka/irasas/2000359728/nemarios-eiles-apie-pragarus-ir-skaistyklas-su-aiste-kiltinaviciute',
'info_dict': {
'id': '2000359728',
'ext': 'm4a',
'title': 'Nemarios eilės: apie pragarus ir skaistyklas su Aiste Kiltinavičiūte',
'description': 'md5:5eee9a0e86a55bf547bd67596204625d',
'timestamp': 1726143120,
'upload_date': '20240912',
'tags': 'count:5',
'thumbnail': r're:https?://.+/.+\.jpe?g',
'categories': ['Daiktiniai įrodymai'],
},
}, {
'url': 'https://www.lrt.lt/radioteka/irasas/2000304654/vakaras-su-knyga-svetlana-aleksijevic-cernobylio-malda-v-dalis?season=%2Fmediateka%2Faudio%2Fvakaras-su-knyga%2F2023',
'only_matching': True,
}]
def _real_extract(self, url):
video_id, path = self._match_valid_url(url).group('id', 'path')
media = self._download_json(
'https://www.lrt.lt/radioteka/api/media', video_id,
query={'url': f'/mediateka/irasas/{video_id}/{path}'})
return {
'id': video_id,
'formats': self._extract_m3u8_formats(media['playlist_item']['file'], video_id),
**traverse_obj(media, {
'title': ('title', {str}),
'tags': ('tags', ..., 'name', {str}),
'categories': ('playlist_item', 'category', {str}, filter, all, filter),
'description': ('content', {clean_html}, {str}),
'timestamp': ('date', {lambda x: x.replace('.', '/')}, {unified_timestamp}),
'thumbnail': ('playlist_item', 'image', {urljoin('https://www.lrt.lt')}),
}),
}


@ -1,31 +1,38 @@
import re
from .common import InfoExtractor
from ..utils import (
clean_html,
determine_ext,
extract_attributes,
int_or_none,
str_to_int,
join_nonempty,
parse_count,
parse_duration,
parse_iso8601,
url_or_none,
urlencode_postdata,
)
from ..utils.traversal import traverse_obj
class ManyVidsIE(InfoExtractor):
_WORKING = False
_VALID_URL = r'(?i)https?://(?:www\.)?manyvids\.com/video/(?P<id>\d+)'
_TESTS = [{
# preview video
'url': 'https://www.manyvids.com/Video/133957/everthing-about-me/',
'md5': '03f11bb21c52dd12a05be21a5c7dcc97',
'url': 'https://www.manyvids.com/Video/530341/mv-tips-tricks',
'md5': '738dc723f7735ee9602f7ea352a6d058',
'info_dict': {
'id': '133957',
'id': '530341-preview',
'ext': 'mp4',
'title': 'everthing about me (Preview)',
'uploader': 'ellyxxix',
'title': 'MV Tips & Tricks (Preview)',
'description': r're:I will take you on a tour around .{1313}$',
'thumbnail': r're:https://cdn5\.manyvids\.com/php_uploads/video_images/DestinyDiaz/.+\.jpg',
'uploader': 'DestinyDiaz',
'view_count': int,
'like_count': int,
'release_timestamp': 1508419904,
'tags': ['AdultSchool', 'BBW', 'SFW', 'TeacherFetish'],
'release_date': '20171019',
'duration': 3167.0,
},
'expected_warnings': ['Only extracting preview'],
}, {
# full video
'url': 'https://www.manyvids.com/Video/935718/MY-FACE-REVEAL/',
@ -34,129 +41,68 @@ class ManyVidsIE(InfoExtractor):
'id': '935718',
'ext': 'mp4',
'title': 'MY FACE REVEAL',
'description': 'md5:ec5901d41808b3746fed90face161612',
'description': r're:Today is the day!! I am finally taking off my mask .{445}$',
'thumbnail': r're:https://ods\.manyvids\.com/1001061960/3aa5397f2a723ec4597e344df66ab845/screenshots/.+\.jpg',
'uploader': 'Sarah Calanthe',
'view_count': int,
'like_count': int,
'release_date': '20181110',
'tags': ['EyeContact', 'Interviews', 'MaskFetish', 'MouthFetish', 'Redhead'],
'release_timestamp': 1541851200,
'duration': 224.0,
},
}]
_API_BASE = 'https://www.manyvids.com/bff/store/video'
def _real_extract(self, url):
video_id = self._match_id(url)
video_data = self._download_json(f'{self._API_BASE}/{video_id}/private', video_id)['data']
formats, preview_only = [], True
real_url = f'https://www.manyvids.com/video/{video_id}/gtm.js'
try:
webpage = self._download_webpage(real_url, video_id)
except Exception:
# probably useless fallback
webpage = self._download_webpage(url, video_id)
info = self._search_regex(
r'''(<div\b[^>]*\bid\s*=\s*(['"])pageMetaDetails\2[^>]*>)''',
webpage, 'meta details', default='')
info = extract_attributes(info)
player = self._search_regex(
r'''(<div\b[^>]*\bid\s*=\s*(['"])rmpPlayerStream\2[^>]*>)''',
webpage, 'player details', default='')
player = extract_attributes(player)
video_urls_and_ids = (
(info.get('data-meta-video'), 'video'),
(player.get('data-video-transcoded'), 'transcoded'),
(player.get('data-video-filepath'), 'filepath'),
(self._og_search_video_url(webpage, secure=False, default=None), 'og_video'),
)
def txt_or_none(s, default=None):
return (s.strip() or default) if isinstance(s, str) else default
uploader = txt_or_none(info.get('data-meta-author'))
def mung_title(s):
if uploader:
s = re.sub(rf'^\s*{re.escape(uploader)}\s+[|-]', '', s)
return txt_or_none(s)
title = (
mung_title(info.get('data-meta-title'))
or self._html_search_regex(
(r'<span[^>]+class=["\']item-title[^>]+>([^<]+)',
r'<h2[^>]+class=["\']h2 m-0["\'][^>]*>([^<]+)'),
webpage, 'title', default=None)
or self._html_search_meta(
'twitter:title', webpage, 'title', fatal=True))
title = re.sub(r'\s*[|-]\s+ManyVids\s*$', '', title) or title
if any(p in webpage for p in ('preview_videos', '_preview.mp4')):
title += ' (Preview)'
mv_token = self._search_regex(
r'data-mvtoken=(["\'])(?P<value>(?:(?!\1).)+)\1', webpage,
'mv token', default=None, group='value')
if mv_token:
# Sets some cookies
self._download_webpage(
'https://www.manyvids.com/includes/ajax_repository/you_had_me_at_hello.php',
video_id, note='Setting format cookies', fatal=False,
data=urlencode_postdata({
'mvtoken': mv_token,
'vid': video_id,
}), headers={
'Referer': url,
'X-Requested-With': 'XMLHttpRequest',
})
formats = []
for v_url, fmt in video_urls_and_ids:
v_url = url_or_none(v_url)
if not v_url:
for format_id, path in [
('preview', ['teaser', 'filepath']),
('transcoded', ['transcodedFilepath']),
('filepath', ['filepath']),
]:
format_url = traverse_obj(video_data, (*path, {url_or_none}))
if not format_url:
continue
if determine_ext(v_url) == 'm3u8':
formats.extend(self._extract_m3u8_formats(
v_url, video_id, 'mp4', entry_protocol='m3u8_native',
m3u8_id='hls'))
if determine_ext(format_url) == 'm3u8':
formats.extend(self._extract_m3u8_formats(format_url, video_id, 'mp4', m3u8_id=format_id))
else:
formats.append({
'url': v_url,
'format_id': fmt,
'url': format_url,
'format_id': format_id,
'preference': -10 if format_id == 'preview' else None,
'quality': 10 if format_id == 'filepath' else None,
'height': int_or_none(
self._search_regex(r'_(\d{2,3}[02468])_', format_url, 'height', default=None)),
})
if format_id != 'preview':
preview_only = False
self._remove_duplicate_formats(formats)
metadata = traverse_obj(
self._download_json(f'{self._API_BASE}/{video_id}', video_id, fatal=False), 'data')
title = traverse_obj(metadata, ('title', {clean_html}))
for f in formats:
if f.get('height') is None:
f['height'] = int_or_none(
self._search_regex(r'_(\d{2,3}[02468])_', f['url'], 'video height', default=None))
if '/preview/' in f['url']:
f['format_id'] = '_'.join(filter(None, (f.get('format_id'), 'preview')))
f['preference'] = -10
if 'transcoded' in f['format_id']:
f['preference'] = f.get('preference', -1) - 1
def get_likes():
likes = self._search_regex(
rf'''(<a\b[^>]*\bdata-id\s*=\s*(['"]){video_id}\2[^>]*>)''',
webpage, 'likes', default='')
likes = extract_attributes(likes)
return int_or_none(likes.get('data-likes'))
def get_views():
return str_to_int(self._html_search_regex(
r'''(?s)<span\b[^>]*\bclass\s*=["']views-wrapper\b[^>]+>.+?<span\b[^>]+>\s*(\d[\d,.]*)\s*</span>''',
webpage, 'view count', default=None))
if preview_only:
title = join_nonempty(title, '(Preview)', delim=' ')
video_id += '-preview'
self.report_warning(
f'Only extracting preview. Video may be paid or subscription only. {self._login_hint()}')
return {
'id': video_id,
'title': title,
'formats': formats,
'description': txt_or_none(info.get('data-meta-description')),
'uploader': txt_or_none(info.get('data-meta-author')),
'thumbnail': (
url_or_none(info.get('data-meta-image'))
or url_or_none(player.get('data-video-screenshot'))),
'view_count': get_views(),
'like_count': get_likes(),
**traverse_obj(metadata, {
'description': ('description', {clean_html}),
'uploader': ('model', 'displayName', {clean_html}),
'thumbnail': (('screenshot', 'thumbnail'), {url_or_none}, any),
'view_count': ('views', {parse_count}),
'like_count': ('likes', {parse_count}),
'release_timestamp': ('launchDate', {parse_iso8601}),
'duration': ('videoDuration', {parse_duration}),
'tags': ('tagList', ..., 'label', {str}, filter, all, filter),
}),
}


@ -102,11 +102,10 @@ def add_item(container, item_url, height, id_key='format_id', item_id=None):
item_id = item_id or '%dp' % height
if item_id not in item_url:
return
width = int(round(aspect_ratio * height))
container.append({
'url': item_url,
id_key: item_id,
'width': width,
'width': round(aspect_ratio * height),
'height': height,
})


@ -4,6 +4,7 @@
from ..utils import (
int_or_none,
parse_iso8601,
parse_resolution,
traverse_obj,
unified_timestamp,
url_basename,
@ -83,8 +84,8 @@ def _sub_to_dict(subtitle_list):
subtitles.setdefault(sub.pop('tag', 'und'), []).append(sub)
return subtitles
def _extract_ism(self, ism_url, video_id):
formats = self._extract_ism_formats(ism_url, video_id)
def _extract_ism(self, ism_url, video_id, fatal=True):
formats = self._extract_ism_formats(ism_url, video_id, fatal=fatal)
for fmt in formats:
if fmt['language'] != 'eng' and 'English' not in fmt['format_id']:
fmt['language_preference'] = -10
@ -218,9 +219,21 @@ class MicrosoftLearnEpisodeIE(MicrosoftMediusBaseIE):
'description': 'md5:7bbbfb593d21c2cf2babc3715ade6b88',
'timestamp': 1676339547,
'upload_date': '20230214',
'thumbnail': r're:https://learn\.microsoft\.com/video/media/.*\.png',
'thumbnail': r're:https://learn\.microsoft\.com/video/media/.+\.png',
'subtitles': 'count:14',
},
}, {
'url': 'https://learn.microsoft.com/en-gb/shows/on-demand-instructor-led-training-series/az-900-module-1',
'info_dict': {
'id': '4fe10f7c-d83c-463b-ac0e-c30a8195e01b',
'ext': 'mp4',
'title': 'AZ-900 Cloud fundamentals (1 of 6)',
'description': 'md5:3c2212ce865e9142f402c766441bd5c9',
'thumbnail': r're:https://.+/.+\.jpg',
'timestamp': 1706605184,
'upload_date': '20240130',
},
'params': {'format': 'bv[protocol=https]'},
}]
def _real_extract(self, url):
@ -230,9 +243,32 @@ def _real_extract(self, url):
entry_id = self._html_search_meta('entryId', webpage, 'entryId', fatal=True)
video_info = self._download_json(
f'https://learn.microsoft.com/api/video/public/v1/entries/{entry_id}', video_id)
formats = []
if ism_url := traverse_obj(video_info, ('publicVideo', 'adaptiveVideoUrl', {url_or_none})):
formats.extend(self._extract_ism(ism_url, video_id, fatal=False))
if hls_url := traverse_obj(video_info, ('publicVideo', 'adaptiveVideoHLSUrl', {url_or_none})):
formats.extend(self._extract_m3u8_formats(hls_url, video_id, 'mp4', m3u8_id='hls', fatal=False))
if mpd_url := traverse_obj(video_info, ('publicVideo', 'adaptiveVideoDashUrl', {url_or_none})):
formats.extend(self._extract_mpd_formats(mpd_url, video_id, mpd_id='dash', fatal=False))
for key in ('low', 'medium', 'high'):
if video_url := traverse_obj(video_info, ('publicVideo', f'{key}QualityVideoUrl', {url_or_none})):
formats.append({
'url': video_url,
'format_id': f'video-http-{key}',
'acodec': 'none',
**parse_resolution(video_url),
})
if audio_url := traverse_obj(video_info, ('publicVideo', 'audioUrl', {url_or_none})):
formats.append({
'url': audio_url,
'format_id': 'audio-http',
'vcodec': 'none',
})
return {
'id': entry_id,
'formats': self._extract_ism(video_info['publicVideo']['adaptiveVideoUrl'], video_id),
'formats': formats,
'subtitles': self._sub_to_dict(traverse_obj(video_info, (
'publicVideo', 'captions', lambda _, v: url_or_none(v['url']), {
'tag': ('language', {str}),


@ -1,5 +1,7 @@
from .telecinco import TelecincoBaseIE
from ..networking.exceptions import HTTPError
from ..utils import (
ExtractorError,
int_or_none,
parse_iso8601,
)
@ -79,7 +81,17 @@ class MiTeleIE(TelecincoBaseIE):
def _real_extract(self, url):
display_id = self._match_id(url)
webpage = self._download_webpage(url, display_id)
try: # yt-dlp's default user-agents are too old and blocked by akamai
webpage = self._download_webpage(url, display_id, headers={
'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; rv:136.0) Gecko/20100101 Firefox/136.0',
})
except ExtractorError as e:
if not isinstance(e.cause, HTTPError) or e.cause.status != 403:
raise
# Retry with impersonation if hardcoded UA is insufficient to bypass akamai
webpage = self._download_webpage(url, display_id, impersonate=True)
pre_player = self._search_json(
r'window\.\$REACTBASE_STATE\.prePlayer_mtweb\s*=',
webpage, 'Pre Player', display_id)['prePlayer']


@ -10,7 +10,9 @@
parse_iso8601,
strip_or_none,
try_get,
url_or_none,
)
from ..utils.traversal import traverse_obj
class MixcloudBaseIE(InfoExtractor):
@ -37,7 +39,7 @@ class MixcloudIE(MixcloudBaseIE):
'ext': 'm4a',
'title': 'Cryptkeeper',
'description': 'After quite a long silence from myself, finally another Drum\'n\'Bass mix with my favourite current dance floor bangers.',
'uploader': 'Daniel Holbach',
'uploader': 'dholbach',
'uploader_id': 'dholbach',
'thumbnail': r're:https?://.*\.jpg',
'view_count': int,
@ -46,10 +48,11 @@ class MixcloudIE(MixcloudBaseIE):
'uploader_url': 'https://www.mixcloud.com/dholbach/',
'artist': 'Submorphics & Chino , Telekinesis, Porter Robinson, Enei, Breakage ft Jess Mills',
'duration': 3723,
'tags': [],
'tags': ['liquid drum and bass', 'drum and bass'],
'comment_count': int,
'repost_count': int,
'like_count': int,
'artists': list,
},
'params': {'skip_download': 'm3u8'},
}, {
@ -67,7 +70,7 @@ class MixcloudIE(MixcloudBaseIE):
'upload_date': '20150203',
'uploader_url': 'https://www.mixcloud.com/gillespeterson/',
'duration': 2992,
'tags': [],
'tags': ['jazz', 'soul', 'world music', 'funk'],
'comment_count': int,
'repost_count': int,
'like_count': int,
@ -149,8 +152,6 @@ def _real_extract(self, url):
elif reason:
raise ExtractorError('Track is restricted', expected=True)
title = cloudcast['name']
stream_info = cloudcast['streamInfo']
formats = []
@ -182,47 +183,39 @@ def _real_extract(self, url):
self.raise_login_required(metadata_available=True)
comments = []
for edge in (try_get(cloudcast, lambda x: x['comments']['edges']) or []):
node = edge.get('node') or {}
for node in traverse_obj(cloudcast, ('comments', 'edges', ..., 'node', {dict})):
text = strip_or_none(node.get('comment'))
if not text:
continue
user = node.get('user') or {}
comments.append({
'author': user.get('displayName'),
'author_id': user.get('username'),
'text': text,
'timestamp': parse_iso8601(node.get('created')),
**traverse_obj(node, {
'author': ('user', 'displayName', {str}),
'author_id': ('user', 'username', {str}),
'timestamp': ('created', {parse_iso8601}),
}),
})
tags = []
for t in cloudcast.get('tags'):
tag = try_get(t, lambda x: x['tag']['name'], str)
if not tag:
tags.append(tag)
get_count = lambda x: int_or_none(try_get(cloudcast, lambda y: y[x]['totalCount']))
owner = cloudcast.get('owner') or {}
return {
'id': track_id,
'title': title,
'formats': formats,
'description': cloudcast.get('description'),
'thumbnail': try_get(cloudcast, lambda x: x['picture']['url'], str),
'uploader': owner.get('displayName'),
'timestamp': parse_iso8601(cloudcast.get('publishDate')),
'uploader_id': owner.get('username'),
'uploader_url': owner.get('url'),
'duration': int_or_none(cloudcast.get('audioLength')),
'view_count': int_or_none(cloudcast.get('plays')),
'like_count': get_count('favorites'),
'repost_count': get_count('reposts'),
'comment_count': get_count('comments'),
'comments': comments,
'tags': tags,
'artist': ', '.join(cloudcast.get('featuringArtistList') or []) or None,
**traverse_obj(cloudcast, {
'title': ('name', {str}),
'description': ('description', {str}),
'thumbnail': ('picture', 'url', {url_or_none}),
'timestamp': ('publishDate', {parse_iso8601}),
'duration': ('audioLength', {int_or_none}),
'uploader': ('owner', 'displayName', {str}),
'uploader_id': ('owner', 'username', {str}),
'uploader_url': ('owner', 'url', {url_or_none}),
'view_count': ('plays', {int_or_none}),
'like_count': ('favorites', 'totalCount', {int_or_none}),
'repost_count': ('reposts', 'totalCount', {int_or_none}),
'comment_count': ('comments', 'totalCount', {int_or_none}),
'tags': ('tags', ..., 'tag', 'name', {str}, filter, all, filter),
'artists': ('featuringArtistList', ..., {str}, filter, all, filter),
}),
}
@ -295,7 +288,7 @@ class MixcloudUserIE(MixcloudPlaylistBaseIE):
'url': 'http://www.mixcloud.com/dholbach/',
'info_dict': {
'id': 'dholbach_uploads',
'title': 'Daniel Holbach (uploads)',
'title': 'dholbach (uploads)',
'description': 'md5:a3f468a60ac8c3e1f8616380fc469b2b',
},
'playlist_mincount': 36,
@ -303,7 +296,7 @@ class MixcloudUserIE(MixcloudPlaylistBaseIE):
'url': 'http://www.mixcloud.com/dholbach/uploads/',
'info_dict': {
'id': 'dholbach_uploads',
'title': 'Daniel Holbach (uploads)',
'title': 'dholbach (uploads)',
'description': 'md5:a3f468a60ac8c3e1f8616380fc469b2b',
},
'playlist_mincount': 36,
@ -311,7 +304,7 @@ class MixcloudUserIE(MixcloudPlaylistBaseIE):
'url': 'http://www.mixcloud.com/dholbach/favorites/',
'info_dict': {
'id': 'dholbach_favorites',
'title': 'Daniel Holbach (favorites)',
'title': 'dholbach (favorites)',
'description': 'md5:a3f468a60ac8c3e1f8616380fc469b2b',
},
# 'params': {
@ -337,7 +330,7 @@ class MixcloudUserIE(MixcloudPlaylistBaseIE):
'title': 'First Ear (stream)',
'description': 'we maraud for ears',
},
'playlist_mincount': 269,
'playlist_mincount': 267,
}]
_TITLE_KEY = 'displayName'
@ -361,7 +354,7 @@ class MixcloudPlaylistIE(MixcloudPlaylistBaseIE):
'id': 'maxvibes_jazzcat-on-ness-radio',
'title': 'Ness Radio sessions',
},
'playlist_mincount': 59,
'playlist_mincount': 58,
}]
_TITLE_KEY = 'name'
_DESCRIPTION_KEY = 'description'
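
Review note on the `tags` change in the hunk above: the removed loop appended a tag only when `not tag` was true, so valid tag names were never collected. The replacement uses the path `('tags', ..., 'tag', 'name', {str}, filter, all, filter)`. The following is a plain-Python sketch of what that path is understood to compute. It does not call yt-dlp; `collect_tag_names` and the sample data are hypothetical, and the exact branching semantics should be checked against the `traverse_obj` docstring.

```python
def collect_tag_names(cloudcast):
    """Approximate ('tags', ..., 'tag', 'name', {str}, filter, all, filter)."""
    names = []
    for entry in cloudcast.get('tags') or []:
        tag = entry.get('tag') if isinstance(entry, dict) else None
        name = tag.get('name') if isinstance(tag, dict) else None
        # {str} keeps only string values; the first `filter` drops empty ones
        if isinstance(name, str) and name:
            names.append(name)
    # `all` gathers the branches into a list; the trailing `filter` is assumed
    # to turn an empty list into None so that the key is omitted
    return names or None


# Hypothetical API response fragment
sample = {'tags': [
    {'tag': {'name': 'jazz'}},
    {'tag': {'name': ''}},
    {'tag': None},
    {'tag': {'name': 'soul'}},
]}
print(collect_tag_names(sample))
```

Unlike the removed loop, this also tolerates a missing or null `tags` value instead of raising while iterating over `None`.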


@ -365,13 +365,15 @@ def _real_initialize(self):
'All videos are only available to registered users', method='password')
def _set_device_id(self, username):
if not self._device_id:
self._device_id = self.cache.load(
self._NETRC_MACHINE, 'device_ids', default={}).get(username)
if self._device_id:
return
device_id_cache = self.cache.load(self._NETRC_MACHINE, 'device_ids', default={})
self._device_id = device_id_cache.get(username)
if self._device_id:
return
self._device_id = str(uuid.uuid4())
self.cache.store(self._NETRC_MACHINE, 'device_ids', {username: self._device_id})
device_id_cache[username] = self._device_id
self.cache.store(self._NETRC_MACHINE, 'device_ids', device_id_cache)
def _perform_login(self, username, password):
try:
@ -449,9 +451,7 @@ def _extract_formats_and_subtitles(self, broadcast, video_id):
if not (m3u8_url and token):
errors = '; '.join(traverse_obj(response, ('errors', ..., 'message', {str})))
if 'not entitled' in errors:
raise ExtractorError(errors, expected=True)
elif errors: # Only warn when 'blacked out' since radio formats are available
if errors: # Only warn when 'blacked out' or 'not entitled'; radio formats may be available
self.report_warning(f'API returned errors for {format_id}: {errors}')
else:
self.report_warning(f'No formats available for {format_id} broadcast; skipping')
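
Review note on the `_set_device_id` hunk above: the removed code stored `{username: self._device_id}`, which replaced the whole cached mapping and so discarded device IDs saved for other usernames. The new code loads the mapping, adds one entry, and stores it back. A minimal sketch of that read-modify-write pattern follows; `MemoryCache` and `get_device_id` are hypothetical stand-ins and not the extractor's real cache API.

```python
import uuid


class MemoryCache:
    """Hypothetical in-memory stand-in for the extractor cache."""

    def __init__(self):
        self._data = {}

    def load(self, section, key, default=None):
        return self._data.get((section, key), default)

    def store(self, section, key, value):
        self._data[(section, key)] = value


def get_device_id(cache, section, username):
    # Load the whole username -> device ID mapping first
    device_ids = cache.load(section, 'device_ids', default={})
    device_id = device_ids.get(username)
    if device_id:
        return device_id
    # Add one entry and write the full mapping back, keeping other users' IDs
    device_id = str(uuid.uuid4())
    device_ids[username] = device_id
    cache.store(section, 'device_ids', device_ids)
    return device_id


cache = MemoryCache()
alice = get_device_id(cache, 'example', 'alice')
bob = get_device_id(cache, 'example', 'bob')
```

After both calls the stored mapping holds entries for `alice` and `bob`, and repeated calls return the same IDs.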


@ -3,8 +3,8 @@
class MoviepilotIE(InfoExtractor):
_IE_NAME = 'moviepilot'
_IE_DESC = 'Moviepilot trailer'
IE_NAME = 'moviepilot'
IE_DESC = 'Moviepilot trailer'
_VALID_URL = r'https?://(?:www\.)?moviepilot\.de/movies/(?P<id>[^/]+)'
_TESTS = [{


@ -19,7 +19,8 @@
class NBACVPBaseIE(TurnerBaseIE):
def _extract_nba_cvp_info(self, path, video_id, fatal=False):
return self._extract_cvp_info(
f'http://secure.nba.com/{path}', video_id, {
# XXX: The 3rd argument (None) needs to be the AdobePass software_statement
f'http://secure.nba.com/{path}', video_id, None, {
'default': {
'media_src': 'http://nba.cdn.turner.com/nba/big',
},
@ -94,6 +95,7 @@ def _extract_video(self, filter_key, filter_value):
class NBAWatchEmbedIE(NBAWatchBaseIE):
_WORKING = False
IE_NAME = 'nba:watch:embed'
_VALID_URL = NBAWatchBaseIE._VALID_URL_BASE + r'embed\?.*?\bid=(?P<id>\d+)'
_TESTS = [{
@ -115,6 +117,7 @@ def _real_extract(self, url):
class NBAWatchIE(NBAWatchBaseIE):
_WORKING = False
IE_NAME = 'nba:watch'
_VALID_URL = NBAWatchBaseIE._VALID_URL_BASE + r'(?:nba/)?video/(?P<id>.+?(?=/index\.html)|(?:[^/]+/)*[^/?#&]+)'
_TESTS = [{
@ -167,6 +170,7 @@ def _real_extract(self, url):
class NBAWatchCollectionIE(NBAWatchBaseIE):
_WORKING = False
IE_NAME = 'nba:watch:collection'
_VALID_URL = NBAWatchBaseIE._VALID_URL_BASE + r'list/collection/(?P<id>[^/?#&]+)'
_TESTS = [{
@ -336,6 +340,7 @@ def _real_extract(self, url):
class NBAEmbedIE(NBABaseIE):
_WORKING = False
IE_NAME = 'nba:embed'
_VALID_URL = r'https?://secure\.nba\.com/assets/amp/include/video/(?:topI|i)frame\.html\?.*?\bcontentId=(?P<id>[^?#&]+)'
_TESTS = [{
@ -358,6 +363,7 @@ def _real_extract(self, url):
class NBAIE(NBABaseIE):
_WORKING = False
IE_NAME = 'nba'
_VALID_URL = NBABaseIE._VALID_URL_BASE + f'(?!{NBABaseIE._CHANNEL_PATH_REGEX})video/(?P<id>(?:[^/]+/)*[^/?#&]+)'
_TESTS = [{
@ -385,6 +391,7 @@ def _extract_url_results(self, team, content_id):
class NBAChannelIE(NBABaseIE):
_WORKING = False
IE_NAME = 'nba:channel'
_VALID_URL = NBABaseIE._VALID_URL_BASE + f'(?:{NBABaseIE._CHANNEL_PATH_REGEX})/(?P<id>[^/?#&]+)'
_TESTS = [{


@ -6,7 +6,7 @@
from .adobepass import AdobePassIE
from .common import InfoExtractor
from .theplatform import ThePlatformIE, default_ns
from .theplatform import ThePlatformBaseIE, ThePlatformIE, default_ns
from ..networking import HEADRequest
from ..utils import (
ExtractorError,
@ -14,26 +14,130 @@
UserNotLive,
clean_html,
determine_ext,
extract_attributes,
float_or_none,
get_element_html_by_class,
int_or_none,
join_nonempty,
make_archive_id,
mimetype2ext,
parse_age_limit,
parse_duration,
parse_iso8601,
remove_end,
smuggle_url,
traverse_obj,
try_get,
unescapeHTML,
unified_timestamp,
update_url_query,
url_basename,
url_or_none,
)
from ..utils.traversal import require, traverse_obj
class NBCIE(ThePlatformIE): # XXX: Do not subclass from concrete IE
_VALID_URL = r'https?(?P<permalink>://(?:www\.)?nbc\.com/(?:classic-tv/)?[^/]+/video/[^/]+/(?P<id>(?:NBCE|n)?\d+))'
class NBCUniversalBaseIE(ThePlatformBaseIE):
_GEO_COUNTRIES = ['US']
_GEO_BYPASS = False
_M3U8_RE = r'https?://[^/?#]+/prod/[\w-]+/(?P<folders>[^?#]+/)cmaf/mpeg_(?:cbcs|cenc)\w*/master_cmaf\w*\.m3u8'
def _download_nbcu_smil_and_extract_m3u8_url(self, tp_path, video_id, query):
smil = self._download_xml(
f'https://link.theplatform.com/s/{tp_path}', video_id,
'Downloading SMIL manifest', 'Failed to download SMIL manifest', query={
**query,
'format': 'SMIL', # XXX: Do not confuse "format" with "formats"
'manifest': 'm3u',
'switch': 'HLSServiceSecure', # Or else we get broken mp4 http URLs instead of HLS
}, headers=self.geo_verification_headers())
ns = f'//{{{default_ns}}}'
if url := traverse_obj(smil, (f'{ns}video/@src', lambda _, v: determine_ext(v) == 'm3u8', any)):
return url
exc = traverse_obj(smil, (f'{ns}param', lambda _, v: v.get('name') == 'exception', '@value', any))
if exc == 'GeoLocationBlocked':
self.raise_geo_restricted(countries=self._GEO_COUNTRIES)
raise ExtractorError(traverse_obj(smil, (f'{ns}ref/@abstract', ..., any)), expected=exc == 'Expired')
def _extract_nbcu_formats_and_subtitles(self, tp_path, video_id, query):
# formats='mpeg4' will return either a working m3u8 URL or an m3u8 template for non-DRM HLS
# formats='m3u+none,mpeg4' may return DRM HLS but w/the "folders" needed for non-DRM template
query['formats'] = 'm3u+none,mpeg4'
m3u8_url = self._download_nbcu_smil_and_extract_m3u8_url(tp_path, video_id, query)
if mobj := re.fullmatch(self._M3U8_RE, m3u8_url):
query['formats'] = 'mpeg4'
m3u8_tmpl = self._download_nbcu_smil_and_extract_m3u8_url(tp_path, video_id, query)
# Example: https://vod-lf-oneapp-prd.akamaized.net/prod/video/{folders}master_hls.m3u8
if '{folders}' in m3u8_tmpl:
self.write_debug('Found m3u8 URL template, formatting URL path')
m3u8_url = m3u8_tmpl.format(folders=mobj.group('folders'))
if '/mpeg_cenc' in m3u8_url or '/mpeg_cbcs' in m3u8_url:
self.report_drm(video_id)
return self._extract_m3u8_formats_and_subtitles(m3u8_url, video_id, 'mp4', m3u8_id='hls')
def _extract_nbcu_video(self, url, display_id, old_ie_key=None):
webpage = self._download_webpage(url, display_id)
settings = self._search_json(
r'<script[^>]+data-drupal-selector="drupal-settings-json"[^>]*>',
webpage, 'settings', display_id)
query = {}
tve = extract_attributes(get_element_html_by_class('tve-video-deck-app', webpage) or '')
if tve:
account_pid = tve.get('data-mpx-media-account-pid') or tve['data-mpx-account-pid']
account_id = tve['data-mpx-media-account-id']
metadata = self._parse_json(
tve.get('data-normalized-video') or '', display_id, fatal=False, transform_source=unescapeHTML)
video_id = tve.get('data-guid') or metadata['guid']
if tve.get('data-entitlement') == 'auth':
auth = settings['tve_adobe_auth']
release_pid = tve['data-release-pid']
resource = self._get_mvpd_resource(
tve.get('data-adobe-pass-resource-id') or auth['adobePassResourceId'],
tve['data-title'], release_pid, tve.get('data-rating'))
query['auth'] = self._extract_mvpd_auth(
url, release_pid, auth['adobePassRequestorId'],
resource, auth['adobePassSoftwareStatement'])
else:
ls_playlist = traverse_obj(settings, (
'ls_playlist', lambda _, v: v['defaultGuid'], any, {require('LS playlist')}))
video_id = ls_playlist['defaultGuid']
account_pid = ls_playlist.get('mpxMediaAccountPid') or ls_playlist['mpxAccountPid']
account_id = ls_playlist['mpxMediaAccountId']
metadata = traverse_obj(ls_playlist, ('videos', lambda _, v: v['guid'] == video_id, any)) or {}
tp_path = f'{account_pid}/media/guid/{account_id}/{video_id}'
formats, subtitles = self._extract_nbcu_formats_and_subtitles(tp_path, video_id, query)
tp_metadata = self._download_theplatform_metadata(tp_path, video_id, fatal=False)
parsed_info = self._parse_theplatform_metadata(tp_metadata)
self._merge_subtitles(parsed_info['subtitles'], target=subtitles)
return {
**parsed_info,
**traverse_obj(metadata, {
'title': ('title', {str}),
'description': ('description', {str}),
'duration': ('durationInSeconds', {int_or_none}),
'timestamp': ('airDate', {parse_iso8601}),
'thumbnail': ('thumbnailUrl', {url_or_none}),
'season_number': ('seasonNumber', {int_or_none}),
'episode_number': ('episodeNumber', {int_or_none}),
'episode': ('episodeTitle', {str}),
'series': ('show', {str}),
}),
'id': video_id,
'display_id': display_id,
'formats': formats,
'subtitles': subtitles,
'_old_archive_ids': [make_archive_id(old_ie_key, video_id)] if old_ie_key else None,
}
class NBCIE(NBCUniversalBaseIE):
_VALID_URL = r'https?(?P<permalink>://(?:www\.)?nbc\.com/(?:classic-tv/)?[^/?#]+/video/[^/?#]+/(?P<id>\w+))'
_TESTS = [
{
'url': 'http://www.nbc.com/the-tonight-show/video/jimmy-fallon-surprises-fans-at-ben-jerrys/2848237',
@ -49,47 +153,20 @@ class NBCIE(ThePlatformIE): # XXX: Do not subclass from concrete IE
'episode_number': 86,
'season': 'Season 2',
'season_number': 2,
'series': 'Tonight Show: Jimmy Fallon',
'duration': 237.0,
'chapters': 'count:1',
'tags': 'count:4',
'series': 'Tonight',
'duration': 236.504,
'tags': 'count:2',
'thumbnail': r're:https?://.+\.jpg',
'categories': ['Series/The Tonight Show Starring Jimmy Fallon'],
'media_type': 'Full Episode',
'age_limit': 14,
'_old_archive_ids': ['theplatform 2848237'],
},
'params': {
'skip_download': 'm3u8',
},
},
{
'url': 'http://www.nbc.com/saturday-night-live/video/star-wars-teaser/2832821',
'info_dict': {
'id': '2832821',
'ext': 'mp4',
'title': 'Star Wars Teaser',
'description': 'md5:0b40f9cbde5b671a7ff62fceccc4f442',
'timestamp': 1417852800,
'upload_date': '20141206',
'uploader': 'NBCU-COM',
},
'skip': 'page not found',
},
{
# HLS streams requires the 'hdnea3' cookie
'url': 'http://www.nbc.com/Kings/video/goliath/n1806',
'info_dict': {
'id': '101528f5a9e8127b107e98c5e6ce4638',
'ext': 'mp4',
'title': 'Goliath',
'description': 'When an unknown soldier saves the life of the King\'s son in battle, he\'s thrust into the limelight and politics of the kingdom.',
'timestamp': 1237100400,
'upload_date': '20090315',
'uploader': 'NBCU-COM',
},
'skip': 'page not found',
},
{
# manifest url does not have extension
'url': 'https://www.nbc.com/the-golden-globe-awards/video/oprah-winfrey-receives-cecil-b-de-mille-award-at-the-2018-golden-globes/3646439',
'info_dict': {
'id': '3646439',
@ -99,48 +176,47 @@ class NBCIE(ThePlatformIE): # XXX: Do not subclass from concrete IE
'episode_number': 1,
'season': 'Season 75',
'season_number': 75,
'series': 'The Golden Globe Awards',
'series': 'Golden Globes',
'description': 'Oprah Winfrey receives the Cecil B. de Mille Award at the 75th Annual Golden Globe Awards.',
'uploader': 'NBCU-COM',
'upload_date': '20180107',
'timestamp': 1515312000,
'duration': 570.0,
'duration': 569.703,
'tags': 'count:8',
'thumbnail': r're:https?://.+\.jpg',
'chapters': 'count:1',
'media_type': 'Highlight',
'age_limit': 0,
'categories': ['Series/The Golden Globe Awards'],
'_old_archive_ids': ['theplatform 3646439'],
},
'params': {
'skip_download': 'm3u8',
},
},
{
# new video_id format
'url': 'https://www.nbc.com/quantum-leap/video/bens-first-leap-nbcs-quantum-leap/NBCE125189978',
# Needs to be extracted from webpage instead of GraphQL
'url': 'https://www.nbc.com/paris2024/video/ali-truwit-found-purpose-pool-after-her-life-changed/para24_sww_alitruwittodayshow_240823',
'info_dict': {
'id': 'NBCE125189978',
'id': 'para24_sww_alitruwittodayshow_240823',
'ext': 'mp4',
'title': 'Ben\'s First Leap | NBC\'s Quantum Leap',
'description': 'md5:a82762449b7ec4bb83291a7b355ebf8e',
'uploader': 'NBCU-COM',
'series': 'Quantum Leap',
'season': 'Season 1',
'season_number': 1,
'episode': 'Ben\'s First Leap | NBC\'s Quantum Leap',
'episode_number': 1,
'duration': 170.171,
'chapters': [],
'timestamp': 1663956155,
'upload_date': '20220923',
'tags': 'count:10',
'age_limit': 0,
'title': 'Ali Truwit found purpose in the pool after her life changed',
'description': 'md5:c16d7489e1516593de1cc5d3f39b9bdb',
'uploader': 'NBCU-SPORTS',
'duration': 311.077,
'thumbnail': r're:https?://.+\.jpg',
'categories': ['Series/Quantum Leap 2022'],
'media_type': 'Highlight',
'episode': 'Ali Truwit found purpose in the pool after her life changed',
'timestamp': 1724435902.0,
'upload_date': '20240823',
'_old_archive_ids': ['theplatform para24_sww_alitruwittodayshow_240823'],
},
'params': {
'skip_download': 'm3u8',
},
},
{
'url': 'https://www.nbc.com/quantum-leap/video/bens-first-leap-nbcs-quantum-leap/NBCE125189978',
'only_matching': True,
},
{
'url': 'https://www.nbc.com/classic-tv/charles-in-charge/video/charles-in-charge-pilot/n3310',
'only_matching': True,
@ -151,6 +227,7 @@ class NBCIE(ThePlatformIE): # XXX: Do not subclass from concrete IE
'only_matching': True,
},
]
_SOFTWARE_STATEMENT = 'eyJhbGciOiJSUzI1NiJ9.eyJzdWIiOiI1Yzg2YjdkYy04NDI3LTRjNDUtOGQwZi1iNDkzYmE3MmQwYjQiLCJuYmYiOjE1Nzg3MDM2MzEsImlzcyI6ImF1dGguYWRvYmUuY29tIiwiaWF0IjoxNTc4NzAzNjMxfQ.QQKIsBhAjGQTMdAqRTqhcz2Cddr4Y2hEjnSiOeKKki4nLrkDOsjQMmqeTR0hSRarraxH54wBgLvsxI7LHwKMvr7G8QpynNAxylHlQD3yhN9tFhxt4KR5wW3as02B-W2TznK9bhNWPKIyHND95Uo2Mi6rEQoq8tM9O09WPWaanE5BX_-r6Llr6dPq5F0Lpx2QOn2xYRb1T4nFxdFTNoss8GBds8OvChTiKpXMLHegLTc1OS4H_1a8tO_37jDwSdJuZ8iTyRLV4kZ2cpL6OL5JPMObD4-HQiec_dfcYgMKPiIfP9ZqdXpec2SVaCLsWEk86ZYvD97hLIQrK5rrKd1y-A'
def _real_extract(self, url):
permalink, video_id = self._match_valid_url(url).groups()
@ -196,62 +273,50 @@ def _real_extract(self, url):
'userId': '0',
}),
})['data']['bonanzaPage']['metadata']
query = {
'mbr': 'true',
'manifest': 'm3u',
'switch': 'HLSServiceSecure',
}
if not video_data:
# Some videos are not available via GraphQL API
webpage = self._download_webpage(url, video_id)
video_data = self._search_json(
r'<script>\s*PRELOAD\s*=', webpage, 'video data',
video_id)['pages'][urllib.parse.urlparse(url).path]['base']['metadata']
video_id = video_data['mpxGuid']
tp_path = 'NnzsPC/media/guid/{}/{}'.format(video_data.get('mpxAccountId') or '2410887629', video_id)
tpm = self._download_theplatform_metadata(tp_path, video_id)
title = tpm.get('title') or video_data.get('secondaryTitle')
tp_path = f'NnzsPC/media/guid/{video_data["mpxAccountId"]}/{video_id}'
tpm = self._download_theplatform_metadata(tp_path, video_id, fatal=False)
title = traverse_obj(tpm, ('title', {str})) or video_data.get('secondaryTitle')
query = {}
if video_data.get('locked'):
resource = self._get_mvpd_resource(
video_data.get('resourceId') or 'nbcentertainment',
title, video_id, video_data.get('rating'))
video_data['resourceId'], title, video_id, video_data.get('rating'))
query['auth'] = self._extract_mvpd_auth(
url, video_id, 'nbcentertainment', resource)
theplatform_url = smuggle_url(update_url_query(
'http://link.theplatform.com/s/NnzsPC/media/guid/{}/{}'.format(video_data.get('mpxAccountId') or '2410887629', video_id),
query), {'force_smil_url': True})
url, video_id, 'nbcentertainment', resource, self._SOFTWARE_STATEMENT)
# Empty string or 0 can be valid values for these. So the check must be `is None`
description = video_data.get('description')
if description is None:
description = tpm.get('description')
episode_number = int_or_none(video_data.get('episodeNumber'))
if episode_number is None:
episode_number = int_or_none(tpm.get('nbcu$airOrder'))
rating = video_data.get('rating')
if rating is None:
try_get(tpm, lambda x: x['ratings'][0]['rating'])
season_number = int_or_none(video_data.get('seasonNumber'))
if season_number is None:
season_number = int_or_none(tpm.get('nbcu$seasonNumber'))
series = video_data.get('seriesShortTitle')
if series is None:
series = tpm.get('nbcu$seriesShortTitle')
tags = video_data.get('keywords')
if tags is None or len(tags) == 0:
tags = tpm.get('keywords')
formats, subtitles = self._extract_nbcu_formats_and_subtitles(tp_path, video_id, query)
parsed_info = self._parse_theplatform_metadata(tpm)
self._merge_subtitles(parsed_info['subtitles'], target=subtitles)
return {
'_type': 'url_transparent',
'age_limit': parse_age_limit(rating),
'description': description,
'episode': title,
'episode_number': episode_number,
**traverse_obj(video_data, {
'description': ('description', {str}, filter),
'episode': ('secondaryTitle', {str}, filter),
'episode_number': ('episodeNumber', {int_or_none}),
'season_number': ('seasonNumber', {int_or_none}),
'age_limit': ('rating', {parse_age_limit}),
'tags': ('keywords', ..., {str}, filter, all, filter),
'series': ('seriesShortTitle', {str}),
}),
**parsed_info,
'id': video_id,
'ie_key': 'ThePlatform',
'season_number': season_number,
'series': series,
'tags': tags,
'title': title,
'url': theplatform_url,
'formats': formats,
'subtitles': subtitles,
'_old_archive_ids': [make_archive_id('ThePlatform', video_id)],
}
class NBCSportsVPlayerIE(InfoExtractor):
_WORKING = False
_VALID_URL_BASE = r'https?://(?:vplayer\.nbcsports\.com|(?:www\.)?nbcsports\.com/vplayer)/'
_VALID_URL = _VALID_URL_BASE + r'(?:[^/]+/)+(?P<id>[0-9a-zA-Z_]+)'
_EMBED_REGEX = [rf'(?:iframe[^>]+|var video|div[^>]+data-(?:mpx-)?)[sS]rc\s?=\s?"(?P<url>{_VALID_URL_BASE}[^\"]+)']
@ -286,6 +351,7 @@ def _real_extract(self, url):
class NBCSportsIE(InfoExtractor):
_WORKING = False
_VALID_URL = r'https?://(?:www\.)?nbcsports\.com//?(?!vplayer/)(?:[^/]+/)+(?P<id>[0-9a-z-]+)'
_TESTS = [{
@ -321,6 +387,7 @@ def _real_extract(self, url):
class NBCSportsStreamIE(AdobePassIE):
_WORKING = False
_VALID_URL = r'https?://stream\.nbcsports\.com/.+?\bpid=(?P<id>\d+)'
_TEST = {
'url': 'http://stream.nbcsports.com/nbcsn/generic?pid=206559',
@ -354,7 +421,7 @@ def _real_extract(self, url):
source_url = video_source['ottStreamUrl']
is_live = video_source.get('type') == 'live' or video_source.get('status') == 'Live'
resource = self._get_mvpd_resource('nbcsports', title, video_id, '')
token = self._extract_mvpd_auth(url, video_id, 'nbcsports', resource)
token = self._extract_mvpd_auth(url, video_id, 'nbcsports', resource, None) # XXX: None arg needs to be software_statement
tokenized_url = self._download_json(
'https://token.playmakerservices.com/cdn',
video_id, data=json.dumps({
@ -534,22 +601,26 @@ class NBCOlympicsIE(InfoExtractor):
IE_NAME = 'nbcolympics'
_VALID_URL = r'https?://www\.nbcolympics\.com/videos?/(?P<id>[0-9a-z-]+)'
_TEST = {
_TESTS = [{
# Geo-restricted to US
'url': 'http://www.nbcolympics.com/video/justin-roses-son-leo-was-tears-after-his-dad-won-gold',
'md5': '54fecf846d05429fbaa18af557ee523a',
'url': 'https://www.nbcolympics.com/videos/watch-final-minutes-team-usas-mens-basketball-gold',
'info_dict': {
'id': 'WjTBzDXx5AUq',
'display_id': 'justin-roses-son-leo-was-tears-after-his-dad-won-gold',
'id': 'SAwGfPlQ1q01',
'ext': 'mp4',
'title': 'Rose\'s son Leo was in tears after his dad won gold',
'description': 'Olympic gold medalist Justin Rose gets emotional talking to the impact his win in men\'s golf has already had on his children.',
'timestamp': 1471274964,
'upload_date': '20160815',
'display_id': 'watch-final-minutes-team-usas-mens-basketball-gold',
'title': 'Watch the final minutes of Team USA\'s men\'s basketball gold',
'description': 'md5:f704f591217305c9559b23b877aa8d31',
'uploader': 'NBCU-SPORTS',
'duration': 387.053,
'thumbnail': r're:https://.+/.+\.jpg',
'chapters': [],
'timestamp': 1723346984,
'upload_date': '20240811',
},
'skip': '404 Not Found',
}
}, {
'url': 'http://www.nbcolympics.com/video/justin-roses-son-leo-was-tears-after-his-dad-won-gold',
'only_matching': True,
}]
def _real_extract(self, url):
display_id = self._match_id(url)
@ -578,6 +649,7 @@ def _real_extract(self, url):
class NBCOlympicsStreamIE(AdobePassIE):
_WORKING = False
IE_NAME = 'nbcolympics:stream'
_VALID_URL = r'https?://stream\.nbcolympics\.com/(?P<id>[0-9a-z-]+)'
_TESTS = [
@ -630,7 +702,8 @@ def _real_extract(self, url):
event_config.get('resourceId', 'NBCOlympics'),
re.sub(r'[^\w\d ]+', '', event_config['eventTitle']), pid,
event_config.get('ratingId', 'NO VALUE'))
media_token = self._extract_mvpd_auth(url, pid, event_config.get('requestorId', 'NBCOlympics'), ap_resource)
# XXX: The None arg below needs to be the software_statement for this requestor
media_token = self._extract_mvpd_auth(url, pid, event_config.get('requestorId', 'NBCOlympics'), ap_resource, None)
source_url = self._download_json(
'https://tokens.playmakerservices.com/', pid, 'Retrieving tokenized URL',
@ -848,3 +921,178 @@ def _real_extract(self, url):
'is_live': is_live,
**info,
}
class BravoTVIE(NBCUniversalBaseIE):
_VALID_URL = r'https?://(?:www\.)?(?:bravotv|oxygen)\.com/(?:[^/?#]+/)+(?P<id>[^/?#]+)'
_TESTS = [{
'url': 'https://www.bravotv.com/top-chef/season-16/episode-15/videos/the-top-chef-season-16-winner-is',
'info_dict': {
'id': '3923059',
'ext': 'mp4',
'title': 'The Top Chef Season 16 Winner Is...',
'display_id': 'the-top-chef-season-16-winner-is',
'description': 'Find out who takes the title of Top Chef!',
'upload_date': '20190315',
'timestamp': 1552618860,
'season_number': 16,
'episode_number': 15,
'series': 'Top Chef',
'episode': 'Finale',
'duration': 190,
'season': 'Season 16',
'thumbnail': r're:^https://.+\.jpg',
'uploader': 'NBCU-BRAV',
'categories': ['Series', 'Series/Top Chef'],
'tags': 'count:10',
},
'params': {'skip_download': 'm3u8'},
}, {
'url': 'https://www.bravotv.com/top-chef/season-20/episode-1/london-calling',
'info_dict': {
'id': '9000234570',
'ext': 'mp4',
'title': 'London Calling',
'display_id': 'london-calling',
'description': 'md5:5af95a8cbac1856bd10e7562f86bb759',
'upload_date': '20230310',
'timestamp': 1678418100,
'season_number': 20,
'episode_number': 1,
'series': 'Top Chef',
'episode': 'London Calling',
'duration': 3266,
'season': 'Season 20',
'chapters': 'count:7',
'thumbnail': r're:^https://.+\.jpg',
'age_limit': 14,
'media_type': 'Full Episode',
'uploader': 'NBCU-MPAT',
'categories': ['Series/Top Chef'],
'tags': 'count:10',
},
'params': {'skip_download': 'm3u8'},
'skip': 'This video requires AdobePass MSO credentials',
}, {
'url': 'https://www.oxygen.com/in-ice-cold-blood/season-1/closing-night',
'info_dict': {
'id': '3692045',
'ext': 'mp4',
'title': 'Closing Night',
'display_id': 'closing-night',
'description': 'md5:c8a5bb523c8ef381f3328c6d9f1e4632',
'upload_date': '20230126',
'timestamp': 1674709200,
'season_number': 1,
'episode_number': 1,
'series': 'In Ice Cold Blood',
'episode': 'Closing Night',
'duration': 2629,
'season': 'Season 1',
'chapters': 'count:6',
'thumbnail': r're:^https://.+\.jpg',
'age_limit': 14,
'media_type': 'Full Episode',
'uploader': 'NBCU-MPAT',
'categories': ['Series/In Ice Cold Blood'],
'tags': ['ice-t', 'in ice cold blood', 'law and order', 'oxygen', 'true crime'],
},
'params': {'skip_download': 'm3u8'},
'skip': 'This video requires AdobePass MSO credentials',
}, {
'url': 'https://www.oxygen.com/in-ice-cold-blood/season-2/episode-16/videos/handling-the-horwitz-house-after-the-murder-season-2',
'info_dict': {
'id': '3974019',
'ext': 'mp4',
'title': '\'Handling The Horwitz House After The Murder (Season 2, Episode 16)',
'display_id': 'handling-the-horwitz-house-after-the-murder-season-2',
'description': 'md5:f9d638dd6946a1c1c0533a9c6100eae5',
'upload_date': '20190618',
'timestamp': 1560819600,
'season_number': 2,
'episode_number': 16,
'series': 'In Ice Cold Blood',
'episode': 'Mother Vs Son',
'duration': 68,
'season': 'Season 2',
'thumbnail': r're:^https://.+\.jpg',
'age_limit': 14,
'uploader': 'NBCU-OXY',
'categories': ['Series/In Ice Cold Blood'],
'tags': ['in ice cold blood', 'ice-t', 'law and order', 'true crime', 'oxygen'],
},
'params': {'skip_download': 'm3u8'},
}, {
'url': 'https://www.bravotv.com/below-deck/season-3/ep-14-reunion-part-1',
'only_matching': True,
}]
def _real_extract(self, url):
display_id = self._match_id(url)
return self._extract_nbcu_video(url, display_id)
class SyfyIE(NBCUniversalBaseIE):
_VALID_URL = r'https?://(?:www\.)?syfy\.com/[^/?#]+/(?:season-\d+/episode-\d+/(?:videos/)?|videos/)(?P<id>[^/?#]+)'
_TESTS = [{
'url': 'https://www.syfy.com/face-off/season-13/episode-10/videos/keyed-up',
'info_dict': {
'id': '3774403',
'ext': 'mp4',
'display_id': 'keyed-up',
'title': 'Keyed Up',
'description': 'md5:feafd15bee449f212dcd3065bbe9a755',
'age_limit': 14,
'duration': 169,
'thumbnail': r're:https://www\.syfy\.com/.+/.+\.jpg',
'series': 'Face Off',
'season': 'Season 13',
'season_number': 13,
'episode': 'Through the Looking Glass Part 2',
'episode_number': 10,
'timestamp': 1533711618,
'upload_date': '20180808',
'media_type': 'Excerpt',
'uploader': 'NBCU-MPAT',
'categories': ['Series/Face Off'],
'tags': 'count:15',
'_old_archive_ids': ['theplatform 3774403'],
},
'params': {'skip_download': 'm3u8'},
}, {
'url': 'https://www.syfy.com/face-off/season-13/episode-10/through-the-looking-glass-part-2',
'info_dict': {
'id': '3772391',
'ext': 'mp4',
'display_id': 'through-the-looking-glass-part-2',
'title': 'Through the Looking Glass Pt.2',
'description': 'md5:90bd5dcbf1059fe3296c263599af41d2',
'age_limit': 0,
'duration': 2599,
'thumbnail': r're:https://www\.syfy\.com/.+/.+\.jpg',
'chapters': [{'start_time': 0.0, 'end_time': 679.0, 'title': '<Untitled Chapter 1>'},
{'start_time': 679.0, 'end_time': 1040.967, 'title': '<Untitled Chapter 2>'},
{'start_time': 1040.967, 'end_time': 1403.0, 'title': '<Untitled Chapter 3>'},
{'start_time': 1403.0, 'end_time': 1870.0, 'title': '<Untitled Chapter 4>'},
{'start_time': 1870.0, 'end_time': 2496.967, 'title': '<Untitled Chapter 5>'},
{'start_time': 2496.967, 'end_time': 2599, 'title': '<Untitled Chapter 6>'}],
'series': 'Face Off',
'season': 'Season 13',
'season_number': 13,
'episode': 'Through the Looking Glass Part 2',
'episode_number': 10,
'timestamp': 1672570800,
'upload_date': '20230101',
'media_type': 'Full Episode',
'uploader': 'NBCU-MPAT',
'categories': ['Series/Face Off'],
'tags': 'count:15',
'_old_archive_ids': ['theplatform 3772391'],
},
'params': {'skip_download': 'm3u8'},
'skip': 'This video requires AdobePass MSO credentials',
}]
def _real_extract(self, url):
display_id = self._match_id(url)
return self._extract_nbcu_video(url, display_id, old_ie_key='ThePlatform')
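
Review note on `_extract_nbcu_formats_and_subtitles` in this file's diff: the method takes the `folders` path segment from a DRM manifest URL and substitutes it into a non-DRM URL template. The sketch below isolates that string handling. The regular expression is copied from `_M3U8_RE`; the host name, the sample paths, and the helper names `resolve_m3u8_url` and `is_drm` are hypothetical and used only for illustration.

```python
import re

# Copied from the `_M3U8_RE` attribute in the diff
M3U8_RE = (r'https?://[^/?#]+/prod/[\w-]+/(?P<folders>[^?#]+/)'
           r'cmaf/mpeg_(?:cbcs|cenc)\w*/master_cmaf\w*\.m3u8')


def resolve_m3u8_url(m3u8_url, m3u8_tmpl):
    """Fill the template with the folders taken from the first manifest URL."""
    mobj = re.fullmatch(M3U8_RE, m3u8_url)
    if mobj and '{folders}' in m3u8_tmpl:
        return m3u8_tmpl.format(folders=mobj.group('folders'))
    # No match or no template: keep the original URL, as the diff does
    return m3u8_url


def is_drm(m3u8_url):
    # Same substring check the diff uses before calling report_drm
    return '/mpeg_cenc' in m3u8_url or '/mpeg_cbcs' in m3u8_url


# Hypothetical URLs shaped like the example comment in the diff
drm_url = 'https://media.example.invalid/prod/video/abc/def/cmaf/mpeg_cenc/master_cmaf.m3u8'
template = 'https://media.example.invalid/prod/video/{folders}master_hls.m3u8'
print(resolve_m3u8_url(drm_url, template))
```

In the diff the template comes from a second SMIL request made with `formats='mpeg4'`; the sketch passes it in directly so that it runs without network access.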


@ -3,6 +3,7 @@
from .art19 import Art19IE
from .common import InfoExtractor
from ..networking import PATCHRequest
from ..networking.exceptions import HTTPError
from ..utils import (
ExtractorError,
@ -74,7 +75,7 @@ def _extract_formats(self, content_id, slug):
'app_version': '23.10.0',
'platform': 'ios',
})
return {'formats': fmts, 'subtitles': subs}
break
except ExtractorError as e:
if isinstance(e.cause, HTTPError) and e.cause.status == 401:
self.raise_login_required()
@ -84,6 +85,9 @@ def _extract_formats(self, content_id, slug):
continue
raise
self.mark_watched(content_id, slug)
return {'formats': fmts, 'subtitles': subs}
def _extract_video_metadata(self, episode):
channel_url = traverse_obj(
episode, (('channel_slug', 'class_slug'), {urljoin('https://nebula.tv/')}), get_all=False)
@ -111,6 +115,13 @@ def _extract_video_metadata(self, episode):
'uploader_url': channel_url,
}
def _mark_watched(self, content_id, slug):
self._call_api(
PATCHRequest(f'https://content.api.nebula.app/{content_id.split(":")[0]}s/{content_id}/progress/'),
slug, 'Marking watched', 'Unable to mark watched', fatal=False,
data=json.dumps({'completed': True}).encode(),
headers={'content-type': 'application/json'})
class NebulaIE(NebulaBaseIE):
IE_NAME = 'nebula:video'
@ -322,6 +333,7 @@ def _real_extract(self, url):
if not episode_url and metadata.get('premium'):
self.raise_login_required()
self.mark_watched(metadata['id'], slug)
if Art19IE.suitable(episode_url):
return self.url_result(episode_url, Art19IE)
return traverse_obj(metadata, {
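
Review note on the new `_mark_watched` method in this file's diff: the progress endpoint is built from the content ID alone, using the part before the colon (plus an `s`) as the collection name. A minimal sketch of that URL construction follows; the helper name `progress_url` and the sample ID are hypothetical, and the ID format is an assumption based only on the `split(":")` call in the diff.

```python
def progress_url(content_id):
    # The prefix before ':' is pluralised to form the collection segment
    kind = content_id.split(':')[0]
    return f'https://content.api.nebula.app/{kind}s/{content_id}/progress/'


# Hypothetical content ID of the form '<kind>:<identifier>'
print(progress_url('video_episode:1234'))
```

The diff sends a PATCH request with the JSON body `{"completed": true}` to this URL and treats failures as non-fatal.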


@ -16,7 +16,7 @@
determine_ext,
float_or_none,
int_or_none,
join_nonempty,
parse_bitrate,
parse_duration,
parse_iso8601,
parse_qs,
@ -24,22 +24,78 @@
qualities,
remove_start,
str_or_none,
traverse_obj,
try_get,
unescapeHTML,
unified_timestamp,
update_url_query,
url_basename,
url_or_none,
urlencode_postdata,
urljoin,
)
from ..utils.traversal import find_element, require, traverse_obj
class NiconicoIE(InfoExtractor):
class NiconicoBaseIE(InfoExtractor):
_GEO_BYPASS = False
_GEO_COUNTRIES = ['JP']
_LOGIN_BASE = 'https://account.nicovideo.jp'
_NETRC_MACHINE = 'niconico'
@property
def is_logged_in(self):
return bool(self._get_cookies('https://www.nicovideo.jp').get('user_session'))
def _raise_login_error(self, message, expected=True):
raise ExtractorError(f'Unable to login: {message}', expected=expected)
def _perform_login(self, username, password):
if self.is_logged_in:
return
self._request_webpage(
f'{self._LOGIN_BASE}/login', None, 'Requesting session cookies')
webpage = self._download_webpage(
f'{self._LOGIN_BASE}/login/redirector', None,
'Logging in', 'Unable to log in', headers={
'Content-Type': 'application/x-www-form-urlencoded',
'Referer': f'{self._LOGIN_BASE}/login',
}, data=urlencode_postdata({
'mail_tel': username,
'password': password,
}))
if self.is_logged_in:
return
elif err_msg := traverse_obj(webpage, (
{find_element(cls='notice error')}, {find_element(cls='notice__text')}, {clean_html},
)):
self._raise_login_error(err_msg or 'Invalid username or password')
elif 'oneTimePw' in webpage:
post_url = self._search_regex(
r'<form[^>]+action=(["\'])(?P<url>.+?)\1', webpage, 'post url', group='url')
mfa, urlh = self._download_webpage_handle(
urljoin(self._LOGIN_BASE, post_url), None,
'Performing MFA', 'Unable to complete MFA', headers={
'Content-Type': 'application/x-www-form-urlencoded',
}, data=urlencode_postdata({
'otp': self._get_tfa_info('6 digit number shown on app'),
}))
if self.is_logged_in:
return
elif 'error-code' in parse_qs(urlh.url):
err_msg = traverse_obj(mfa, ({find_element(cls='pageMainMsg')}, {clean_html}))
self._raise_login_error(err_msg or 'MFA session expired')
elif 'formError' in mfa:
err_msg = traverse_obj(mfa, (
{find_element(cls='formError')}, {find_element(tag='div')}, {clean_html}))
self._raise_login_error(err_msg or 'MFA challenge failed')
self._raise_login_error('Unexpected login error', expected=False)
class NiconicoIE(NiconicoBaseIE):
IE_NAME = 'niconico'
IE_DESC = 'ニコニコ動画'
_GEO_COUNTRIES = ['JP']
_GEO_BYPASS = False
_TESTS = [{
'url': 'http://www.nicovideo.jp/watch/sm22312215',
@ -179,229 +235,6 @@ class NiconicoIE(InfoExtractor):
}]
_VALID_URL = r'https?://(?:(?:www\.|secure\.|sp\.)?nicovideo\.jp/watch|nico\.ms)/(?P<id>(?:[a-z]{2})?[0-9]+)'
_NETRC_MACHINE = 'niconico'
_API_HEADERS = {
'X-Frontend-ID': '6',
'X-Frontend-Version': '0',
'X-Niconico-Language': 'en-us',
'Referer': 'https://www.nicovideo.jp/',
'Origin': 'https://www.nicovideo.jp',
}
def _perform_login(self, username, password):
login_ok = True
login_form_strs = {
'mail_tel': username,
'password': password,
}
self._request_webpage(
'https://account.nicovideo.jp/login', None,
note='Acquiring Login session')
page = self._download_webpage(
'https://account.nicovideo.jp/login/redirector?show_button_twitter=1&site=niconico&show_button_facebook=1', None,
note='Logging in', errnote='Unable to log in',
data=urlencode_postdata(login_form_strs),
headers={
'Referer': 'https://account.nicovideo.jp/login',
'Content-Type': 'application/x-www-form-urlencoded',
})
if 'oneTimePw' in page:
post_url = self._search_regex(
r'<form[^>]+action=(["\'])(?P<url>.+?)\1', page, 'post url', group='url')
page = self._download_webpage(
urljoin('https://account.nicovideo.jp', post_url), None,
note='Performing MFA', errnote='Unable to complete MFA',
data=urlencode_postdata({
'otp': self._get_tfa_info('6 digits code'),
}), headers={
'Content-Type': 'application/x-www-form-urlencoded',
})
if 'oneTimePw' in page or 'formError' in page:
err_msg = self._html_search_regex(
r'formError["\']+>(.*?)</div>', page, 'form_error',
default='There\'s an error but the message can\'t be parsed.',
flags=re.DOTALL)
self.report_warning(f'Unable to log in: MFA challenge failed, "{err_msg}"')
return False
login_ok = 'class="notice error"' not in page
if not login_ok:
self.report_warning('Unable to log in: bad username or password')
return login_ok
def _get_heartbeat_info(self, info_dict):
video_id, video_src_id, audio_src_id = info_dict['url'].split(':')[1].split('/')
dmc_protocol = info_dict['expected_protocol']
api_data = (
info_dict.get('_api_data')
or self._parse_json(
self._html_search_regex(
'data-api-data="([^"]+)"',
self._download_webpage('https://www.nicovideo.jp/watch/' + video_id, video_id),
'API data', default='{}'),
video_id))
session_api_data = try_get(api_data, lambda x: x['media']['delivery']['movie']['session'])
session_api_endpoint = try_get(session_api_data, lambda x: x['urls'][0])
def ping():
tracking_id = traverse_obj(api_data, ('media', 'delivery', 'trackingId'))
if tracking_id:
tracking_url = update_url_query('https://nvapi.nicovideo.jp/v1/2ab0cbaa/watch', {'t': tracking_id})
watch_request_response = self._download_json(
tracking_url, video_id,
note='Acquiring permission for downloading video', fatal=False,
headers=self._API_HEADERS)
if traverse_obj(watch_request_response, ('meta', 'status')) != 200:
self.report_warning('Failed to acquire permission for playing video. Video download may fail.')
yesno = lambda x: 'yes' if x else 'no'
if dmc_protocol == 'http':
protocol = 'http'
protocol_parameters = {
'http_output_download_parameters': {
'use_ssl': yesno(session_api_data['urls'][0]['isSsl']),
'use_well_known_port': yesno(session_api_data['urls'][0]['isWellKnownPort']),
},
}
elif dmc_protocol == 'hls':
protocol = 'm3u8'
segment_duration = try_get(self._configuration_arg('segment_duration'), lambda x: int(x[0])) or 6000
parsed_token = self._parse_json(session_api_data['token'], video_id)
encryption = traverse_obj(api_data, ('media', 'delivery', 'encryption'))
protocol_parameters = {
'hls_parameters': {
'segment_duration': segment_duration,
'transfer_preset': '',
'use_ssl': yesno(session_api_data['urls'][0]['isSsl']),
'use_well_known_port': yesno(session_api_data['urls'][0]['isWellKnownPort']),
},
}
if 'hls_encryption' in parsed_token and encryption:
protocol_parameters['hls_parameters']['encryption'] = {
parsed_token['hls_encryption']: {
'encrypted_key': encryption['encryptedKey'],
'key_uri': encryption['keyUri'],
},
}
else:
protocol = 'm3u8_native'
else:
raise ExtractorError(f'Unsupported DMC protocol: {dmc_protocol}')
session_response = self._download_json(
session_api_endpoint['url'], video_id,
query={'_format': 'json'},
headers={'Content-Type': 'application/json'},
note='Downloading JSON metadata for {}'.format(info_dict['format_id']),
data=json.dumps({
'session': {
'client_info': {
'player_id': session_api_data.get('playerId'),
},
'content_auth': {
'auth_type': try_get(session_api_data, lambda x: x['authTypes'][session_api_data['protocols'][0]]),
'content_key_timeout': session_api_data.get('contentKeyTimeout'),
'service_id': 'nicovideo',
'service_user_id': session_api_data.get('serviceUserId'),
},
'content_id': session_api_data.get('contentId'),
'content_src_id_sets': [{
'content_src_ids': [{
'src_id_to_mux': {
'audio_src_ids': [audio_src_id],
'video_src_ids': [video_src_id],
},
}],
}],
'content_type': 'movie',
'content_uri': '',
'keep_method': {
'heartbeat': {
'lifetime': session_api_data.get('heartbeatLifetime'),
},
},
'priority': session_api_data['priority'],
'protocol': {
'name': 'http',
'parameters': {
'http_parameters': {
'parameters': protocol_parameters,
},
},
},
'recipe_id': session_api_data.get('recipeId'),
'session_operation_auth': {
'session_operation_auth_by_signature': {
'signature': session_api_data.get('signature'),
'token': session_api_data.get('token'),
},
},
'timing_constraint': 'unlimited',
},
}).encode())
info_dict['url'] = session_response['data']['session']['content_uri']
info_dict['protocol'] = protocol
# get heartbeat info
heartbeat_info_dict = {
'url': session_api_endpoint['url'] + '/' + session_response['data']['session']['id'] + '?_format=json&_method=PUT',
'data': json.dumps(session_response['data']),
# interval: convert milliseconds to seconds, then divide by 3 to leave a buffer.
'interval': float_or_none(session_api_data.get('heartbeatLifetime'), scale=3000),
'ping': ping,
}
return info_dict, heartbeat_info_dict
def _extract_format_for_quality(self, video_id, audio_quality, video_quality, dmc_protocol):
if not audio_quality.get('isAvailable') or not video_quality.get('isAvailable'):
return None
format_id = '-'.join(
[remove_start(s['id'], 'archive_') for s in (video_quality, audio_quality)] + [dmc_protocol])
vid_qual_label = traverse_obj(video_quality, ('metadata', 'label'))
return {
'url': 'niconico_dmc:{}/{}/{}'.format(video_id, video_quality['id'], audio_quality['id']),
'format_id': format_id,
'format_note': join_nonempty('DMC', vid_qual_label, dmc_protocol.upper(), delim=' '),
'ext': 'mp4', # The Session API is used in HTML5, which always serves mp4
'acodec': 'aac',
'vcodec': 'h264',
**traverse_obj(audio_quality, ('metadata', {
'abr': ('bitrate', {float_or_none(scale=1000)}),
'asr': ('samplingRate', {int_or_none}),
})),
**traverse_obj(video_quality, ('metadata', {
'vbr': ('bitrate', {float_or_none(scale=1000)}),
'height': ('resolution', 'height', {int_or_none}),
'width': ('resolution', 'width', {int_or_none}),
})),
'quality': -2 if 'low' in video_quality['id'] else None,
'protocol': 'niconico_dmc',
'expected_protocol': dmc_protocol, # XXX: This is not a documented field
'http_headers': {
'Origin': 'https://www.nicovideo.jp',
'Referer': 'https://www.nicovideo.jp/watch/' + video_id,
},
}
def _yield_dmc_formats(self, api_data, video_id):
dmc_data = traverse_obj(api_data, ('media', 'delivery', 'movie'))
audios = traverse_obj(dmc_data, ('audios', ..., {dict}))
videos = traverse_obj(dmc_data, ('videos', ..., {dict}))
protocols = traverse_obj(dmc_data, ('session', 'protocols', ..., {str}))
if not all((audios, videos, protocols)):
return
for audio_quality, video_quality, protocol in itertools.product(audios, videos, protocols):
if fmt := self._extract_format_for_quality(video_id, audio_quality, video_quality, protocol):
yield fmt
def _yield_dms_formats(self, api_data, video_id):
fmt_filter = lambda _, v: v['isAvailable'] and v['id']
@ -450,42 +283,61 @@ def _yield_dms_formats(self, api_data, video_id):
lambda _, v: v['id'] == video_fmt['format_id'], 'qualityLevel', {int_or_none}, any)) or -1
yield video_fmt
def _extract_server_response(self, webpage, video_id, fatal=True):
try:
return traverse_obj(
self._parse_json(self._html_search_meta('server-response', webpage) or '', video_id),
('data', 'response', {dict}, {require('server response')}))
except ExtractorError:
if not fatal:
return {}
raise
def _real_extract(self, url):
video_id = self._match_id(url)
try:
webpage, handle = self._download_webpage_handle(
'https://www.nicovideo.jp/watch/' + video_id, video_id)
f'https://www.nicovideo.jp/watch/{video_id}', video_id,
headers=self.geo_verification_headers())
if video_id.startswith('so'):
video_id = self._match_id(handle.url)
api_data = traverse_obj(
self._parse_json(self._html_search_meta('server-response', webpage) or '', video_id),
('data', 'response', {dict}))
if not api_data:
raise ExtractorError('Server response data not found')
api_data = self._extract_server_response(webpage, video_id)
except ExtractorError as e:
try:
api_data = self._download_json(
f'https://www.nicovideo.jp/api/watch/v3/{video_id}?_frontendId=6&_frontendVersion=0&actionTrackId=AAAAAAAAAA_{round(time.time() * 1000)}', video_id,
note='Downloading API JSON', errnote='Unable to fetch data')['data']
f'https://www.nicovideo.jp/api/watch/v3/{video_id}', video_id,
'Downloading API JSON', 'Unable to fetch data', query={
'_frontendId': '6',
'_frontendVersion': '0',
'actionTrackId': f'AAAAAAAAAA_{round(time.time() * 1000)}',
}, headers=self.geo_verification_headers())['data']
except ExtractorError:
if not isinstance(e.cause, HTTPError):
# Raise if original exception was from _parse_json or utils.traversal.require
raise
# The webpage server response has more detailed error info than the API response
webpage = e.cause.response.read().decode('utf-8', 'replace')
error_msg = self._html_search_regex(
r'(?s)<section\s+class="(?:(?:ErrorMessage|WatchExceptionPage-message)\s*)+">(.+?)</section>',
webpage, 'error reason', default=None)
if not error_msg:
reason_code = self._extract_server_response(
webpage, video_id, fatal=False).get('reasonCode')
if not reason_code:
raise
raise ExtractorError(clean_html(error_msg), expected=True)
if reason_code in ('DOMESTIC_VIDEO', 'HIGH_RISK_COUNTRY_VIDEO'):
self.raise_geo_restricted(countries=self._GEO_COUNTRIES)
elif reason_code == 'HIDDEN_VIDEO':
raise ExtractorError(
'The viewing period of this video has expired', expected=True)
elif reason_code == 'DELETED_VIDEO':
raise ExtractorError('This video has been deleted', expected=True)
raise ExtractorError(f'Niconico says: {reason_code}')
availability = self._availability(**(traverse_obj(api_data, ('payment', 'video', {
'needs_premium': ('isPremium', {bool}),
'needs_subscription': ('isAdmission', {bool}),
})) or {'needs_auth': True}))
formats = [*self._yield_dmc_formats(api_data, video_id),
*self._yield_dms_formats(api_data, video_id)]
formats = list(self._yield_dms_formats(api_data, video_id))
if not formats:
fail_msg = clean_html(self._html_search_regex(
r'<p[^>]+\bclass="fail-message"[^>]*>(?P<msg>.+?)</p>',
@ -920,7 +772,7 @@ def _real_extract(self, url):
return self.playlist_result(self._entries(list_id), list_id)
class NiconicoLiveIE(InfoExtractor):
class NiconicoLiveIE(NiconicoBaseIE):
IE_NAME = 'niconico:live'
IE_DESC = 'ニコニコ生放送'
_VALID_URL = r'https?://(?:sp\.)?live2?\.nicovideo\.jp/(?:watch|gate)/(?P<id>lv\d+)'
@ -952,8 +804,6 @@ class NiconicoLiveIE(InfoExtractor):
'only_matching': True,
}]
_KNOWN_LATENCY = ('high', 'low')
def _real_extract(self, url):
video_id = self._match_id(url)
webpage, urlh = self._download_webpage_handle(f'https://live.nicovideo.jp/watch/{video_id}', video_id)
@ -969,22 +819,20 @@ def _real_extract(self, url):
})
hostname = remove_start(urllib.parse.urlparse(urlh.url).hostname, 'sp.')
latency = try_get(self._configuration_arg('latency'), lambda x: x[0])
if latency not in self._KNOWN_LATENCY:
latency = 'high'
ws = self._request_webpage(
Request(ws_url, headers={'Origin': f'https://{hostname}'}),
video_id=video_id, note='Connecting to WebSocket server')
self.write_debug('[debug] Sending HLS server request')
self.write_debug('Sending HLS server request')
ws.send(json.dumps({
'type': 'startWatching',
'data': {
'stream': {
'quality': 'abr',
'protocol': 'hls+fmp4',
'latency': latency,
'protocol': 'hls',
'latency': 'high',
'accessRightMethod': 'single_cookie',
'chasePlay': False,
},
'room': {
@ -1005,6 +853,7 @@ def _real_extract(self, url):
if data.get('type') == 'stream':
m3u8_url = data['data']['uri']
qualities = data['data']['availableQualities']
cookies = data['data']['cookies']
break
elif data.get('type') == 'disconnect':
self.write_debug(recv)
@ -1043,16 +892,32 @@ def _real_extract(self, url):
**res,
})
formats = self._extract_m3u8_formats(m3u8_url, video_id, ext='mp4', live=True)
for fmt, q in zip(formats, reversed(qualities[1:])):
fmt.update({
'format_id': q,
'protocol': 'niconico_live',
'ws': ws,
'video_id': video_id,
'live_latency': latency,
for cookie in cookies:
self._set_cookie(
cookie['domain'], cookie['name'], cookie['value'],
expire_time=unified_timestamp(cookie.get('expires')), path=cookie['path'], secure=cookie['secure'])
fmt_common = {
'live_latency': 'high',
'origin': hostname,
'protocol': 'niconico_live',
'video_id': video_id,
'ws': ws,
}
q_iter = (q for q in qualities[1:] if not q.startswith('audio_')) # ignore initial 'abr'
a_map = {96: 'audio_low', 192: 'audio_high'}
formats = self._extract_m3u8_formats(m3u8_url, video_id, ext='mp4', live=True)
for fmt in formats:
if fmt.get('acodec') == 'none':
fmt['format_id'] = next(q_iter, fmt['format_id'])
elif fmt.get('vcodec') == 'none':
abr = parse_bitrate(fmt['url'].lower())
fmt.update({
'abr': abr,
'format_id': a_map.get(abr, fmt['format_id']),
})
fmt.update(fmt_common)
return {
'id': video_id,


@ -1,34 +1,46 @@
import json
import re
from .brightcove import BrightcoveNewIE
from .common import InfoExtractor
from ..utils import (
ExtractorError,
float_or_none,
int_or_none,
smuggle_url,
parse_iso8601,
parse_resolution,
str_or_none,
try_get,
unified_strdate,
unified_timestamp,
url_or_none,
)
from ..utils.traversal import require, traverse_obj, value
class NineNowIE(InfoExtractor):
IE_NAME = '9now.com.au'
_VALID_URL = r'https?://(?:www\.)?9now\.com\.au/(?:[^/]+/){2}(?P<id>[^/?#]+)'
_GEO_COUNTRIES = ['AU']
_VALID_URL = r'https?://(?:www\.)?9now\.com\.au/(?:[^/?#]+/){2}(?P<id>(?P<type>clip|episode)-[^/?#]+)'
_GEO_BYPASS = False
_TESTS = [{
# clip
'url': 'https://www.9now.com.au/afl-footy-show/2016/clip-ciql02091000g0hp5oktrnytc',
'md5': '17cf47d63ec9323e562c9957a968b565',
'url': 'https://www.9now.com.au/today/season-2025/clip-cm8hw9h5z00080hquqa5hszq7',
'info_dict': {
'id': '16801',
'id': '6370295582112',
'ext': 'mp4',
'title': 'St. Kilda\'s Joey Montagna on the potential for a player\'s strike',
'description': 'Is a boycott of the NAB Cup "on the table"?',
'title': 'Would Karl Stefanovic be able to land a plane?',
'description': 'The Today host\'s skills are put to the test with the latest simulation tech.',
'uploader_id': '4460760524001',
'upload_date': '20160713',
'timestamp': 1468421266,
'duration': 197.376,
'tags': ['flights', 'technology', 'Karl Stefanovic'],
'season': 'Season 2025',
'season_number': 2025,
'series': 'TODAY',
'timestamp': 1742507988,
'upload_date': '20250320',
'release_timestamp': 1742507983,
'release_date': '20250320',
'thumbnail': r're:https?://.+/1920x0/.+\.jpg',
},
'params': {
'skip_download': 'HLS/DASH fragments and mp4 URLs are geo-restricted; only available in AU',
},
'skip': 'Only available in Australia',
}, {
# episode
'url': 'https://www.9now.com.au/afl-footy-show/2016/episode-19',
@ -41,7 +53,7 @@ class NineNowIE(InfoExtractor):
# episode of series
'url': 'https://www.9now.com.au/lego-masters/season-3/episode-3',
'info_dict': {
'id': '6249614030001',
'id': '6308830406112',
'title': 'Episode 3',
'ext': 'mp4',
'season_number': 3,
@ -50,72 +62,87 @@ class NineNowIE(InfoExtractor):
'uploader_id': '4460760524001',
'timestamp': 1619002200,
'upload_date': '20210421',
'duration': 3574.085,
'thumbnail': r're:https?://.+/1920x0/.+\.jpg',
'tags': ['episode'],
'series': 'Lego Masters',
'season': 'Season 3',
'episode': 'Episode 3',
'release_timestamp': 1619002200,
'release_date': '20210421',
},
'expected_warnings': ['Ignoring subtitle tracks'],
'params': {
'skip_download': True,
'skip_download': 'HLS/DASH fragments and mp4 URLs are geo-restricted; only available in AU',
},
}, {
'url': 'https://www.9now.com.au/married-at-first-sight/season-12/episode-1',
'info_dict': {
'id': '6367798770112',
'ext': 'mp4',
'title': 'Episode 1',
'description': r're:The cultural sensation of Married At First Sight returns with our first weddings! .{90}$',
'uploader_id': '4460760524001',
'duration': 5415.079,
'thumbnail': r're:https?://.+/1920x0/.+\.png',
'tags': ['episode'],
'season': 'Season 12',
'season_number': 12,
'episode': 'Episode 1',
'episode_number': 1,
'series': 'Married at First Sight',
'timestamp': 1737973800,
'upload_date': '20250127',
'release_timestamp': 1737973800,
'release_date': '20250127',
},
'params': {
'skip_download': 'HLS/DASH fragments and mp4 URLs are geo-restricted; only available in AU',
},
}]
BRIGHTCOVE_URL_TEMPLATE = 'http://players.brightcove.net/4460760524001/default_default/index.html?videoId=%s'
BRIGHTCOVE_URL_TEMPLATE = 'http://players.brightcove.net/4460760524001/default_default/index.html?videoId={}'
# XXX: For parsing next.js v15+ data; see also yt_dlp.extractor.francetv and yt_dlp.extractor.goplay
def _find_json(self, s):
return self._search_json(
r'\w+\s*:\s*', s, 'next js data', None, contains_pattern=r'\[(?s:.+)\]', default=None)
def _real_extract(self, url):
display_id = self._match_id(url)
display_id, video_type = self._match_valid_url(url).group('id', 'type')
webpage = self._download_webpage(url, display_id)
page_data = self._parse_json(self._search_regex(
r'window\.__data\s*=\s*({.*?});', webpage,
'page data', default='{}'), display_id, fatal=False)
if not page_data:
page_data = self._parse_json(self._parse_json(self._search_regex(
r'window\.__data\s*=\s*JSON\.parse\s*\(\s*(".+?")\s*\)\s*;',
webpage, 'page data'), display_id), display_id)
for kind in ('episode', 'clip'):
current_key = page_data.get(kind, {}).get(
f'current{kind.capitalize()}Key')
if not current_key:
continue
cache = page_data.get(kind, {}).get(f'{kind}Cache', {})
if not cache:
continue
common_data = {
'episode': (cache.get(current_key) or next(iter(cache.values())))[kind],
'season': (cache.get(current_key) or next(iter(cache.values()))).get('season', None),
}
break
else:
raise ExtractorError('Unable to find video data')
common_data = traverse_obj(
re.findall(r'<script[^>]*>\s*self\.__next_f\.push\(\s*(\[.+?\])\s*\);?\s*</script>', webpage),
(..., {json.loads}, ..., {self._find_json},
lambda _, v: v['payload'][video_type]['slug'] == display_id,
'payload', any, {require('video data')}))
if not self.get_param('allow_unplayable_formats') and try_get(common_data, lambda x: x['episode']['video']['drm'], bool):
if traverse_obj(common_data, (video_type, 'video', 'drm', {bool})):
self.report_drm(display_id)
brightcove_id = try_get(
common_data, lambda x: x['episode']['video']['brightcoveId'], str) or 'ref:{}'.format(common_data['episode']['video']['referenceId'])
video_id = str_or_none(try_get(common_data, lambda x: x['episode']['video']['id'])) or brightcove_id
title = try_get(common_data, lambda x: x['episode']['name'], str)
season_number = try_get(common_data, lambda x: x['season']['seasonNumber'], int)
episode_number = try_get(common_data, lambda x: x['episode']['episodeNumber'], int)
timestamp = unified_timestamp(try_get(common_data, lambda x: x['episode']['airDate'], str))
release_date = unified_strdate(try_get(common_data, lambda x: x['episode']['availability'], str))
thumbnails_data = try_get(common_data, lambda x: x['episode']['image']['sizes'], dict) or {}
thumbnails = [{
'id': thumbnail_id,
'url': thumbnail_url,
'width': int_or_none(thumbnail_id[1:]),
} for thumbnail_id, thumbnail_url in thumbnails_data.items()]
brightcove_id = traverse_obj(common_data, (
video_type, 'video', (
('brightcoveId', {str}),
('referenceId', {str}, {lambda x: f'ref:{x}' if x else None}),
), any, {require('brightcove ID')}))
return {
'_type': 'url_transparent',
'url': smuggle_url(
self.BRIGHTCOVE_URL_TEMPLATE % brightcove_id,
{'geo_countries': self._GEO_COUNTRIES}),
'id': video_id,
'title': title,
'description': try_get(common_data, lambda x: x['episode']['description'], str),
'duration': float_or_none(try_get(common_data, lambda x: x['episode']['video']['duration'], float), 1000),
'thumbnails': thumbnails,
'ie_key': 'BrightcoveNew',
'season_number': season_number,
'episode_number': episode_number,
'timestamp': timestamp,
'release_date': release_date,
'ie_key': BrightcoveNewIE.ie_key(),
'url': self.BRIGHTCOVE_URL_TEMPLATE.format(brightcove_id),
**traverse_obj(common_data, {
'id': (video_type, 'video', 'id', {int}, ({str_or_none}, {value(brightcove_id)}), any),
'title': (video_type, 'name', {str}),
'description': (video_type, 'description', {str}),
'duration': (video_type, 'video', 'duration', {float_or_none(scale=1000)}),
'tags': (video_type, 'tags', ..., 'name', {str}, all, filter),
'series': ('tvSeries', 'name', {str}),
'season_number': ('season', 'seasonNumber', {int_or_none}),
'episode_number': ('episode', 'episodeNumber', {int_or_none}),
'timestamp': ('episode', 'airDate', {parse_iso8601}),
'release_timestamp': (video_type, 'availability', {parse_iso8601}),
'thumbnails': (video_type, 'image', 'sizes', {dict.items}, lambda _, v: url_or_none(v[1]), {
'id': 0,
'url': 1,
'width': (1, {parse_resolution}, 'width'),
}),
}),
}
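The lookup of the `self.__next_f.push(...)` payload in the hunk above can be sketched with the standard library alone. This is only an illustration under stated assumptions, not the extractor's implementation: the sample markup, the `5:` row prefix, the slug `clip-abc` and the helper name `find_payload` are invented for the example, and plain loops stand in for `traverse_obj` and `_search_json`.

```python
import json
import re

# Same script-tag pattern as in the diff above.
NEXT_F_RE = r'<script[^>]*>\s*self\.__next_f\.push\(\s*(\[.+?\])\s*\);?\s*</script>'


def find_payload(webpage, video_type, display_id):
    """Return the first payload whose slug equals display_id, else None."""
    for raw in re.findall(NEXT_F_RE, webpage):
        for item in json.loads(raw):
            if not isinstance(item, str):
                continue
            # Rows look like '<key>:<JSON array>'; keep only the array part.
            m = re.search(r'\w+\s*:\s*(\[.+\])', item, flags=re.DOTALL)
            if not m:
                continue
            try:
                row = json.loads(m.group(1))
            except json.JSONDecodeError:
                continue
            for node in row:
                payload = node.get('payload') if isinstance(node, dict) else None
                video = payload.get(video_type) if isinstance(payload, dict) else None
                if isinstance(video, dict) and video.get('slug') == display_id:
                    return payload
    return None


# Hypothetical page fragment, built with json.dumps so the escaping is valid.
_inner = json.dumps(['$', 'div', None, {
    'payload': {'clip': {'slug': 'clip-abc', 'name': 'Example clip'}},
}])
_chunk = json.dumps([1, f'5:{_inner}\n'])
SAMPLE = f'<script>self.__next_f.push({_chunk})</script>'

if __name__ == '__main__':
    print(find_payload(SAMPLE, 'clip', 'clip-abc'))
```

The real extractor raises via `require('video data')` when nothing matches; the sketch returns `None` instead so the miss case is easy to inspect.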


@ -181,6 +181,7 @@ class NYTimesArticleIE(NYTimesBaseIE):
'thumbnail': r're:https?://\w+\.nyt.com/images/.*\.jpg',
'duration': 119.0,
},
'skip': 'HTTP Error 500: Internal Server Error',
}, {
# article with audio and no video
'url': 'https://www.nytimes.com/2023/09/29/health/mosquitoes-genetic-engineering.html',
@ -190,13 +191,14 @@ class NYTimesArticleIE(NYTimesBaseIE):
'ext': 'mp3',
'title': 'The Gamble: Can Genetically Modified Mosquitoes End Disease?',
'description': 'md5:9ff8b47acbaf7f3ca8c732f5c815be2e',
'timestamp': 1695960700,
'timestamp': 1696008129,
'upload_date': '20230929',
'creator': 'Stephanie Nolen, Natalija Gormalova',
'creators': ['Stephanie Nolen', 'Natalija Gormalova'],
'thumbnail': r're:https?://\w+\.nyt.com/images/.*\.jpg',
'duration': 1322,
},
}, {
# lede_media_block already has sourceId
'url': 'https://www.nytimes.com/2023/11/29/business/dealbook/kamala-harris-biden-voters.html',
'md5': '3eb5ddb1d6f86254fe4f233826778737',
'info_dict': {
@ -207,7 +209,7 @@ class NYTimesArticleIE(NYTimesBaseIE):
'timestamp': 1701290997,
'upload_date': '20231129',
'uploader': 'By The New York Times',
'creator': 'Katie Rogers',
'creators': ['Katie Rogers'],
'thumbnail': r're:https?://\w+\.nyt.com/images/.*\.jpg',
'duration': 97.631,
},
@ -222,10 +224,22 @@ class NYTimesArticleIE(NYTimesBaseIE):
'title': 'Drunk and Asleep on the Job: Air Traffic Controllers Pushed to the Brink',
'description': 'md5:549e5a5e935bf7d048be53ba3d2c863d',
'upload_date': '20231202',
'creator': 'Emily Steel, Sydney Ember',
'creators': ['Emily Steel', 'Sydney Ember'],
'timestamp': 1701511264,
},
'playlist_count': 3,
}, {
# lede_media_block does not have sourceId
'url': 'https://www.nytimes.com/2025/04/30/well/move/hip-mobility-routine.html',
'info_dict': {
'id': 'hip-mobility-routine',
'title': 'Tight Hips? These Moves Can Help.',
'description': 'Sitting all day is hard on your hips. Try this simple routine for better mobility.',
'creators': ['Alyssa Ages', 'Theodore Tae'],
'timestamp': 1746003629,
'upload_date': '20250430',
},
'playlist_count': 7,
}, {
'url': 'https://www.nytimes.com/2023/12/02/business/media/netflix-squid-game-challenge.html',
'only_matching': True,
@ -256,14 +270,18 @@ def _extract_content_from_block(self, block):
def _real_extract(self, url):
page_id = self._match_id(url)
webpage = self._download_webpage(url, page_id)
webpage = self._download_webpage(url, page_id, impersonate=True)
art_json = self._search_json(
r'window\.__preloadedData\s*=', webpage, 'media details', page_id,
transform_source=lambda x: x.replace('undefined', 'null'))['initialData']['data']['article']
content = art_json['sprinkledBody']['content']
blocks = traverse_obj(art_json, (
'sprinkledBody', 'content', ..., ('ledeMedia', None),
lambda _, v: v['__typename'] in ('Video', 'Audio')))
blocks = []
block_filter = lambda k, v: k == 'media' and v['__typename'] in ('Video', 'Audio')
if lede_media_block := traverse_obj(content, (..., 'ledeMedia', block_filter, any)):
lede_media_block.setdefault('sourceId', art_json.get('sourceId'))
blocks.append(lede_media_block)
blocks.extend(traverse_obj(content, (..., block_filter)))
if not blocks:
raise ExtractorError('Unable to extract any media blocks from webpage')
@ -273,8 +291,7 @@ def _real_extract(self, url):
'sprinkledBody', 'content', ..., 'summary', 'content', ..., 'text', {str}),
get_all=False) or self._html_search_meta(['og:description', 'twitter:description'], webpage),
'timestamp': traverse_obj(art_json, ('firstPublished', {parse_iso8601})),
'creator': ', '.join(
traverse_obj(art_json, ('bylines', ..., 'creators', ..., 'displayName'))), # TODO: change to 'creators' (list)
'creators': traverse_obj(art_json, ('bylines', ..., 'creators', ..., 'displayName', {str})),
'thumbnails': self._extract_thumbnails(traverse_obj(
art_json, ('promotionalMedia', 'assetCrops', ..., 'renditions', ...))),
}
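The new block collection in `_real_extract` above, including the `sourceId` fallback for the lede media, can be approximated with plain dictionaries. This is a rough sketch under assumptions: the `ARTICLE` data and the helper names `is_media` and `collect_media_blocks` are made up for illustration, and explicit loops replace the `traverse_obj` calls, so edge cases may differ from the real code.

```python
MEDIA_TYPES = ('Video', 'Audio')


def is_media(node):
    """True for dicts typed as Video or Audio."""
    return isinstance(node, dict) and node.get('__typename') in MEDIA_TYPES


def collect_media_blocks(article):
    content = article['sprinkledBody']['content']
    blocks = []
    # Lede media first; it may lack its own sourceId, so fall back to the article's.
    for block in content:
        lede = block.get('ledeMedia')
        media = lede.get('media') if isinstance(lede, dict) else None
        if is_media(media):
            media.setdefault('sourceId', article.get('sourceId'))
            blocks.append(media)
            break  # mirrors `any`: only the first match is used
    # Then any media attached directly to a content block.
    blocks.extend(block['media'] for block in content if is_media(block.get('media')))
    return blocks


# Hypothetical article data shaped like the fields the diff reads.
ARTICLE = {
    'sourceId': '100000001',
    'sprinkledBody': {'content': [
        {'ledeMedia': {'media': {'__typename': 'Video', 'id': 'lede'}}},
        {'media': {'__typename': 'Audio', 'id': 'inline', 'sourceId': '42'}},
        {'media': {'__typename': 'Image', 'id': 'photo'}},
    ]},
}

if __name__ == '__main__':
    for media_block in collect_media_blocks(ARTICLE):
        print(media_block['id'], media_block['sourceId'])
```

`setdefault` only fills the key when it is absent, so a lede block that already carries a `sourceId` keeps its own value.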


@ -11,12 +11,15 @@ class On24IE(InfoExtractor):
IE_NAME = 'on24'
IE_DESC = 'ON24'
_VALID_URL = r'''(?x)
https?://event\.on24\.com/(?:
wcc/r/(?P<id_1>\d{7})/(?P<key_1>[0-9A-F]{32})|
eventRegistration/(?:console/EventConsoleApollo|EventLobbyServlet\?target=lobby30)
\.jsp\?(?:[^/#?]*&)?eventid=(?P<id_2>\d{7})[^/#?]*&key=(?P<key_2>[0-9A-F]{32})
)'''
_ID_RE = r'(?P<id>\d{7})'
_KEY_RE = r'(?P<key>[0-9A-F]{32})'
_URL_BASE_RE = r'https?://event\.on24\.com'
_URL_QUERY_RE = rf'(?:[^#]*&)?eventid={_ID_RE}&(?:[^#]+&)?key={_KEY_RE}'
_VALID_URL = [
rf'{_URL_BASE_RE}/wcc/r/{_ID_RE}/{_KEY_RE}',
rf'{_URL_BASE_RE}/eventRegistration/console/(?:EventConsoleApollo\.jsp|apollox/mainEvent/?)\?{_URL_QUERY_RE}',
rf'{_URL_BASE_RE}/eventRegistration/EventLobbyServlet/?\?{_URL_QUERY_RE}',
]
_TESTS = [{
'url': 'https://event.on24.com/eventRegistration/console/EventConsoleApollo.jsp?uimode=nextgeneration&eventid=2197467&sessionid=1&key=5DF57BE53237F36A43B478DD36277A84&contenttype=A&eventuserid=305999&playerwidth=1000&playerheight=650&caller=previewLobby&text_language_id=en&format=fhaudio&newConsole=false',
@ -34,12 +37,16 @@ class On24IE(InfoExtractor):
}, {
'url': 'https://event.on24.com/eventRegistration/console/EventConsoleApollo.jsp?&eventid=2639291&sessionid=1&username=&partnerref=&format=fhvideo1&mobile=&flashsupportedmobiledevice=&helpcenter=&key=82829018E813065A122363877975752E&newConsole=true&nxChe=true&newTabCon=true&text_language_id=en&playerwidth=748&playerheight=526&eventuserid=338788762&contenttype=A&mediametricsessionid=384764716&mediametricid=3558192&usercd=369267058&mode=launch',
'only_matching': True,
}, {
'url': 'https://event.on24.com/eventRegistration/EventLobbyServlet?target=reg20.jsp&eventid=3543176&key=BC0F6B968B67C34B50D461D40FDB3E18&groupId=3143628',
'only_matching': True,
}, {
'url': 'https://event.on24.com/eventRegistration/console/apollox/mainEvent?&eventid=4843671&sessionid=1&username=&partnerref=&format=fhvideo1&mobile=&flashsupportedmobiledevice=&helpcenter=&key=4EAC9B5C564CC98FF29E619B06A2F743&newConsole=true&nxChe=true&newTabCon=true&consoleEarEventConsole=false&consoleEarCloudApi=false&text_language_id=en&playerwidth=748&playerheight=526&referrer=https%3A%2F%2Fevent.on24.com%2Finterface%2Fregistration%2Fautoreg%2Findex.html%3Fsessionid%3D1%26eventid%3D4843671%26key%3D4EAC9B5C564CC98FF29E619B06A2F743%26email%3D000a3e42-7952-4dd6-8f8a-34c38ea3cf02%2540platform%26firstname%3Ds%26lastname%3Ds%26deletecookie%3Dtrue%26event_email%3DN%26marketing_email%3DN%26std1%3D0642572014177%26std2%3D0642572014179%26std3%3D550165f7-a44e-4725-9fe6-716f89908c2b%26std4%3D0&eventuserid=745776448&contenttype=A&mediametricsessionid=640613707&mediametricid=6810717&usercd=745776448&mode=launch',
'only_matching': True,
}]
def _real_extract(self, url):
mobj = self._match_valid_url(url)
event_id = mobj.group('id_1') or mobj.group('id_2')
event_key = mobj.group('key_1') or mobj.group('key_2')
event_id, event_key = self._match_valid_url(url).group('id', 'key')
event_data = self._download_json(
'https://event.on24.com/apic/utilApp/EventConsoleCachedServlet',
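The composed `_VALID_URL` patterns from this hunk can be tried outside yt-dlp with `re` alone. In this sketch the pattern strings are copied from the diff, while the helper `match_on24` and the `/wcc/r/` sample URL (assembled from the ID and key of the first test URL) are assumptions made for the example.

```python
import re

ID_RE = r'(?P<id>\d{7})'
KEY_RE = r'(?P<key>[0-9A-F]{32})'
URL_BASE_RE = r'https?://event\.on24\.com'
URL_QUERY_RE = rf'(?:[^#]*&)?eventid={ID_RE}&(?:[^#]+&)?key={KEY_RE}'
VALID_URLS = [
    rf'{URL_BASE_RE}/wcc/r/{ID_RE}/{KEY_RE}',
    rf'{URL_BASE_RE}/eventRegistration/console/(?:EventConsoleApollo\.jsp|apollox/mainEvent/?)\?{URL_QUERY_RE}',
    rf'{URL_BASE_RE}/eventRegistration/EventLobbyServlet/?\?{URL_QUERY_RE}',
]


def match_on24(url):
    """Return (event_id, event_key) from the first matching pattern, else None."""
    for pattern in VALID_URLS:
        if m := re.match(pattern, url):
            return m.group('id', 'key')
    return None


# Test URL taken from the diff above.
LOBBY_URL = (
    'https://event.on24.com/eventRegistration/EventLobbyServlet'
    '?target=reg20.jsp&eventid=3543176&key=BC0F6B968B67C34B50D461D40FDB3E18&groupId=3143628')
# Constructed URL for illustration only.
WCC_URL = 'https://event.on24.com/wcc/r/2197467/5DF57BE53237F36A43B478DD36277A84'

if __name__ == '__main__':
    print(match_on24(LOBBY_URL))
    print(match_on24(WCC_URL))
```

Because every pattern uses the same `id` and `key` group names, a single `group('id', 'key')` call replaces the old `id_1`/`id_2` and `key_1`/`key_2` fallbacks. The query pattern expects `eventid` to appear before `key`.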


@ -1,40 +0,0 @@
import re
from .common import InfoExtractor
class OnceIE(InfoExtractor): # XXX: Conventionally, base classes should end with BaseIE/InfoExtractor
_VALID_URL = r'https?://.+?\.unicornmedia\.com/now/(?:ads/vmap/)?[^/]+/[^/]+/(?P<domain_id>[^/]+)/(?P<application_id>[^/]+)/(?:[^/]+/)?(?P<media_item_id>[^/]+)/content\.(?:once|m3u8|mp4)'
ADAPTIVE_URL_TEMPLATE = 'http://once.unicornmedia.com/now/master/playlist/%s/%s/%s/content.m3u8'
PROGRESSIVE_URL_TEMPLATE = 'http://once.unicornmedia.com/now/media/progressive/%s/%s/%s/%s/content.mp4'
def _extract_once_formats(self, url, http_formats_preference=None):
domain_id, application_id, media_item_id = re.match(
OnceIE._VALID_URL, url).groups()
formats = self._extract_m3u8_formats(
self.ADAPTIVE_URL_TEMPLATE % (
domain_id, application_id, media_item_id),
media_item_id, 'mp4', m3u8_id='hls', fatal=False)
progressive_formats = []
for adaptive_format in formats:
# Prevent advertisement from embedding into m3u8 playlist (see
# https://github.com/ytdl-org/youtube-dl/issues/8893#issuecomment-199912684)
adaptive_format['url'] = re.sub(
r'\badsegmentlength=\d+', r'adsegmentlength=0', adaptive_format['url'])
rendition_id = self._search_regex(
r'/now/media/playlist/[^/]+/[^/]+/([^/]+)',
adaptive_format['url'], 'rendition id', default=None)
if rendition_id:
progressive_format = adaptive_format.copy()
progressive_format.update({
'url': self.PROGRESSIVE_URL_TEMPLATE % (
domain_id, application_id, rendition_id, media_item_id),
'format_id': adaptive_format['format_id'].replace(
'hls', 'http'),
'protocol': 'http',
'preference': http_formats_preference,
})
progressive_formats.append(progressive_format)
self._check_formats(progressive_formats, media_item_id)
formats.extend(progressive_formats)
return formats


@ -14,8 +14,9 @@
int_or_none,
parse_qs,
srt_subtitles_timecode,
traverse_obj,
url_or_none,
)
from ..utils.traversal import traverse_obj
class PanoptoBaseIE(InfoExtractor):
@ -345,21 +346,16 @@ def _extract_streams_formats_and_subtitles(self, video_id, streams, **fmt_kwargs
subtitles = {}
for stream in streams or []:
stream_formats = []
http_stream_url = stream.get('StreamHttpUrl')
stream_url = stream.get('StreamUrl')
if http_stream_url:
stream_formats.append({'url': http_stream_url})
if stream_url:
for stream_url in set(traverse_obj(stream, (('StreamHttpUrl', 'StreamUrl'), {url_or_none}))):
media_type = stream.get('ViewerMediaFileTypeName')
if media_type in ('hls', ):
m3u8_formats, stream_subtitles = self._extract_m3u8_formats_and_subtitles(stream_url, video_id)
stream_formats.extend(m3u8_formats)
subtitles = self._merge_subtitles(subtitles, stream_subtitles)
fmts, subs = self._extract_m3u8_formats_and_subtitles(stream_url, video_id, m3u8_id='hls', fatal=False)
stream_formats.extend(fmts)
self._merge_subtitles(subs, target=subtitles)
else:
stream_formats.append({
'url': stream_url,
'ext': media_type,
})
for fmt in stream_formats:
fmt.update({

101
yt_dlp/extractor/parti.py Normal file

@ -0,0 +1,101 @@
from .common import InfoExtractor
from ..utils import UserNotLive, int_or_none, parse_iso8601, url_or_none, urljoin
from ..utils.traversal import traverse_obj
class PartiBaseIE(InfoExtractor):
def _call_api(self, path, video_id, note=None):
return self._download_json(
f'https://api-backend.parti.com/parti_v2/profile/{path}', video_id, note)
class PartiVideoIE(PartiBaseIE):
IE_NAME = 'parti:video'
_VALID_URL = r'https?://(?:www\.)?parti\.com/video/(?P<id>\d+)'
_TESTS = [{
'url': 'https://parti.com/video/66284',
'info_dict': {
'id': '66284',
'ext': 'mp4',
'title': 'NOW LIVE ',
'upload_date': '20250327',
'categories': ['Gaming'],
'thumbnail': 'https://assets.parti.com/351424_eb9e5250-2821-484a-9c5f-ca99aa666c87.png',
'channel': 'ItZTMGG',
'timestamp': 1743044379,
},
'params': {'skip_download': 'm3u8'},
}]
def _real_extract(self, url):
video_id = self._match_id(url)
data = self._call_api(f'get_livestream_channel_info/recent/{video_id}', video_id)
return {
'id': video_id,
'formats': self._extract_m3u8_formats(
urljoin('https://watch.parti.com', data['livestream_recording']), video_id, 'mp4'),
**traverse_obj(data, {
'title': ('event_title', {str}),
'channel': ('user_name', {str}),
'thumbnail': ('event_file', {url_or_none}),
'categories': ('category_name', {str}, filter, all),
'timestamp': ('event_start_ts', {int_or_none}),
}),
}
class PartiLivestreamIE(PartiBaseIE):
IE_NAME = 'parti:livestream'
_VALID_URL = r'https?://(?:www\.)?parti\.com/creator/(?P<service>[\w]+)/(?P<id>[\w/-]+)'
_TESTS = [{
'url': 'https://parti.com/creator/parti/Capt_Robs_Adventures',
'info_dict': {
'id': 'Capt_Robs_Adventures',
'ext': 'mp4',
'title': r"re:I'm Live on Parti \d{4}-\d{2}-\d{2} \d{2}:\d{2}",
'view_count': int,
'thumbnail': r're:https://assets\.parti\.com/.+\.png',
'timestamp': 1743879776,
'upload_date': '20250405',
'live_status': 'is_live',
},
'params': {'skip_download': 'm3u8'},
}, {
'url': 'https://parti.com/creator/discord/sazboxgaming/0',
'only_matching': True,
}]
def _real_extract(self, url):
service, creator_slug = self._match_valid_url(url).group('service', 'id')
encoded_creator_slug = creator_slug.replace('/', '%23')
creator_id = self._call_api(
f'get_user_by_social_media/{service}/{encoded_creator_slug}',
creator_slug, note='Fetching user ID')
data = self._call_api(
f'get_livestream_channel_info/{creator_id}', creator_id,
note='Fetching user profile feed')['channel_info']
if not traverse_obj(data, ('channel', 'is_live', {bool})):
raise UserNotLive(video_id=creator_id)
channel_info = data['channel']
return {
'id': creator_slug,
'formats': self._extract_m3u8_formats(
channel_info['playback_url'], creator_slug, live=True, query={
'token': channel_info['playback_auth_token'],
'player_version': '1.17.0',
}),
'is_live': True,
**traverse_obj(data, {
'title': ('livestream_event_info', 'event_name', {str}),
'description': ('livestream_event_info', 'event_description', {str}),
'thumbnail': ('livestream_event_info', 'livestream_preview_file', {url_or_none}),
'timestamp': ('stream', 'start_time', {parse_iso8601}),
'view_count': ('stream', 'viewer_count', {int_or_none}),
}),
}
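
Note on the creator lookup in PartiLivestreamIE above: the slug captured by _VALID_URL may contain a slash (see the only_matching test URL), and it is rewritten to '%23' before being placed in the API path. A minimal standalone sketch of just that step follows. The regex is copied from the diff; the helper name build_lookup_path is hypothetical, and the sketch makes no network requests.

```python
import re

# Pattern copied from PartiLivestreamIE._VALID_URL in the diff above
VALID_URL = r'https?://(?:www\.)?parti\.com/creator/(?P<service>[\w]+)/(?P<id>[\w/-]+)'


def build_lookup_path(url):
    """Hypothetical helper: return the API path used for the user-ID lookup."""
    mobj = re.match(VALID_URL, url)
    if not mobj:
        raise ValueError(f'not a Parti creator URL: {url}')
    service, creator_slug = mobj.group('service', 'id')
    # Same substitution as in the diff: '/' inside the slug becomes '%23'
    encoded_creator_slug = creator_slug.replace('/', '%23')
    return f'get_user_by_social_media/{service}/{encoded_creator_slug}'


print(build_lookup_path('https://parti.com/creator/discord/sazboxgaming/0'))
```

A slug without a slash, such as the one in the first test URL, passes through unchanged.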

View File

@ -340,8 +340,9 @@ def _real_extract(self, url):
'channel_follower_count': ('attributes', 'patron_count', {int_or_none}),
}))
# all-lowercase 'referer' so we can smuggle it to Generic, SproutVideo, Vimeo
headers = {'referer': 'https://patreon.com/'}
# Must be all-lowercase 'referer' so we can smuggle it to Generic, SproutVideo, and Vimeo.
# patreon.com URLs redirect to www.patreon.com; this matters when requesting mux.com m3u8s
headers = {'referer': 'https://www.patreon.com/'}
# handle Vimeo embeds
if traverse_obj(attributes, ('embed', 'provider')) == 'Vimeo':
@ -352,7 +353,7 @@ def _real_extract(self, url):
v_url, video_id, 'Checking Vimeo embed URL', headers=headers,
fatal=False, errnote=False, expected_status=429): # 429 is TLS fingerprint rejection
entries.append(self.url_result(
VimeoIE._smuggle_referrer(v_url, 'https://patreon.com/'),
VimeoIE._smuggle_referrer(v_url, headers['referer']),
VimeoIE, url_transparent=True))
embed_url = traverse_obj(attributes, ('embed', 'url', {url_or_none}))
@ -379,11 +380,13 @@ def _real_extract(self, url):
'url': post_file['url'],
})
elif name == 'video' or determine_ext(post_file.get('url')) == 'm3u8':
formats, subtitles = self._extract_m3u8_formats_and_subtitles(post_file['url'], video_id)
formats, subtitles = self._extract_m3u8_formats_and_subtitles(
post_file['url'], video_id, headers=headers)
entries.append({
'id': video_id,
'formats': formats,
'subtitles': subtitles,
'http_headers': headers,
})
can_view_post = traverse_obj(attributes, 'current_user_can_view')

View File

@ -1,5 +1,3 @@
import re
from .youtube import YoutubeIE
from .zdf import ZDFBaseIE
from ..utils import (
@ -7,44 +5,27 @@
merge_dicts,
try_get,
unified_timestamp,
urljoin,
)
class PhoenixIE(ZDFBaseIE):
IE_NAME = 'phoenix.de'
_VALID_URL = r'https?://(?:www\.)?phoenix\.de/(?:[^/]+/)*[^/?#&]*-a-(?P<id>\d+)\.html'
_VALID_URL = r'https?://(?:www\.)?phoenix\.de/(?:[^/?#]+/)*[^/?#&]*-a-(?P<id>\d+)\.html'
_TESTS = [{
# Same as https://www.zdf.de/politik/phoenix-sendungen/wohin-fuehrt-der-protest-in-der-pandemie-100.html
'url': 'https://www.phoenix.de/sendungen/ereignisse/corona-nachgehakt/wohin-fuehrt-der-protest-in-der-pandemie-a-2050630.html',
'md5': '34ec321e7eb34231fd88616c65c92db0',
'url': 'https://www.phoenix.de/sendungen/dokumentationen/spitzbergen-a-893349.html',
'md5': 'a79e86d9774d0b3f2102aff988a0bd32',
'info_dict': {
'id': '210222_phx_nachgehakt_corona_protest',
'id': '221215_phx_spitzbergen',
'ext': 'mp4',
'title': 'Wohin führt der Protest in der Pandemie?',
'description': 'md5:7d643fe7f565e53a24aac036b2122fbd',
'duration': 1691,
'timestamp': 1613902500,
'upload_date': '20210221',
'title': 'Spitzbergen',
'description': 'Film von Tilmann Bünz',
'duration': 728.0,
'timestamp': 1555600500,
'upload_date': '20190418',
'uploader': 'Phoenix',
'series': 'corona nachgehakt',
'episode': 'Wohin führt der Protest in der Pandemie?',
},
}, {
# Youtube embed
'url': 'https://www.phoenix.de/sendungen/gespraeche/phoenix-streitgut-brennglas-corona-a-1965505.html',
'info_dict': {
'id': 'hMQtqFYjomk',
'ext': 'mp4',
'title': 'phoenix streitgut: Brennglas Corona - Wie gerecht ist unsere Gesellschaft?',
'description': 'md5:ac7a02e2eb3cb17600bc372e4ab28fdd',
'duration': 3509,
'upload_date': '20201219',
'uploader': 'phoenix',
'uploader_id': 'phoenix',
},
'params': {
'skip_download': True,
'thumbnail': 'https://www.phoenix.de/sixcms/media.php/21/Bergspitzen1.png',
'series': 'Dokumentationen',
'episode': 'Spitzbergen',
},
}, {
'url': 'https://www.phoenix.de/entwicklungen-in-russland-a-2044720.html',
@ -90,8 +71,8 @@ def _real_extract(self, url):
content_id = details['tracking']['nielsen']['content']['assetid']
info = self._extract_ptmd(
f'https://tmd.phoenix.de/tmd/2/ngplayer_2_3/vod/ptmd/phoenix/{content_id}',
content_id, None, url)
f'https://tmd.phoenix.de/tmd/2/android_native_6/vod/ptmd/phoenix/{content_id}',
content_id)
duration = int_or_none(try_get(
details, lambda x: x['tracking']['nielsen']['content']['length']))
@ -101,20 +82,8 @@ def _real_extract(self, url):
str)
episode = title if details.get('contentType') == 'episode' else None
thumbnails = []
teaser_images = try_get(details, lambda x: x['teaserImageRef']['layouts'], dict) or {}
for thumbnail_key, thumbnail_url in teaser_images.items():
thumbnail_url = urljoin(url, thumbnail_url)
if not thumbnail_url:
continue
thumbnail = {
'url': thumbnail_url,
}
m = re.match('^([0-9]+)x([0-9]+)$', thumbnail_key)
if m:
thumbnail['width'] = int(m.group(1))
thumbnail['height'] = int(m.group(2))
thumbnails.append(thumbnail)
thumbnails = self._extract_thumbnails(teaser_images)
return merge_dicts(info, {
'id': content_id,
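
For reference, the inline thumbnail loop removed in this hunk can be expressed as a standalone function. This is only a sketch of the removed logic, not of self._extract_thumbnails, whose implementation is not shown in this diff. It uses urllib.parse.urljoin as a stand-in for the yt-dlp urljoin helper (the two are not identical), and the function name parse_teaser_images and the sample data are hypothetical.

```python
import re
from urllib.parse import urljoin  # stand-in for yt_dlp.utils.urljoin


def parse_teaser_images(base_url, teaser_images):
    """Turn a {'WIDTHxHEIGHT': path} mapping into a list of thumbnail dicts."""
    thumbnails = []
    for key, path in (teaser_images or {}).items():
        if not path:
            continue
        thumbnail = {'url': urljoin(base_url, path)}
        # Keys such as '1920x1080' carry the dimensions; other keys do not
        m = re.match(r'^([0-9]+)x([0-9]+)$', key)
        if m:
            thumbnail['width'] = int(m.group(1))
            thumbnail['height'] = int(m.group(2))
        thumbnails.append(thumbnail)
    return thumbnails


sample = {'1920x1080': '/img/a.jpg', 'original': '/img/b.jpg'}
print(parse_teaser_images('https://www.phoenix.de/page-a-1.html', sample))
```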

View File

@ -10,7 +10,8 @@
class PicartoIE(InfoExtractor):
_VALID_URL = r'https?://(?:www.)?picarto\.tv/(?P<id>[a-zA-Z0-9]+)'
IE_NAME = 'picarto'
_VALID_URL = r'https?://(?:www.)?picarto\.tv/(?P<id>[^/#?]+)/?(?:$|[?#])'
_TEST = {
'url': 'https://picarto.tv/Setz',
'info_dict': {
@ -89,7 +90,8 @@ def _real_extract(self, url):
class PicartoVodIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?picarto\.tv/(?:videopopout|\w+/videos)/(?P<id>[^/?#&]+)'
IE_NAME = 'picarto:vod'
_VALID_URL = r'https?://(?:www\.)?picarto\.tv/(?:videopopout|\w+(?:/profile)?/videos)/(?P<id>[^/?#&]+)'
_TESTS = [{
'url': 'https://picarto.tv/videopopout/ArtofZod_2017.12.12.00.13.23.flv',
'md5': '3ab45ba4352c52ee841a28fb73f2d9ca',
@ -111,6 +113,18 @@ class PicartoVodIE(InfoExtractor):
'channel': 'ArtofZod',
'age_limit': 18,
},
}, {
'url': 'https://picarto.tv/DrechuArt/profile/videos/400347',
'md5': 'f9ea54868b1d9dec40eb554b484cc7bf',
'info_dict': {
'id': '400347',
'ext': 'mp4',
'title': 'Welcome to the Show',
'thumbnail': r're:^https?://.*\.jpg',
'channel': 'DrechuArt',
'age_limit': 0,
},
}, {
'url': 'https://picarto.tv/videopopout/Plague',
'only_matching': True,
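
The effect of the two _VALID_URL changes can be checked with the standard re module. The patterns below are copied verbatim from the diff; match_id is a hypothetical helper that mimics a prefix match and is not part of yt-dlp. The sketch suggests that the old live pattern also accepted VOD URLs (capturing only the first path segment), while the new one requires the channel name to end the path.

```python
import re

# Copied verbatim from the diff; note the dot in `www.` is unescaped in the source
OLD_LIVE = r'https?://(?:www.)?picarto\.tv/(?P<id>[a-zA-Z0-9]+)'
NEW_LIVE = r'https?://(?:www.)?picarto\.tv/(?P<id>[^/#?]+)/?(?:$|[?#])'
OLD_VOD = r'https?://(?:www\.)?picarto\.tv/(?:videopopout|\w+/videos)/(?P<id>[^/?#&]+)'
NEW_VOD = r'https?://(?:www\.)?picarto\.tv/(?:videopopout|\w+(?:/profile)?/videos)/(?P<id>[^/?#&]+)'


def match_id(pattern, url):
    """Hypothetical helper: return the 'id' group, or None if the URL does not match."""
    mobj = re.match(pattern, url)
    return mobj.group('id') if mobj else None


vod_url = 'https://picarto.tv/DrechuArt/profile/videos/400347'
for name, pattern in [('OLD_LIVE', OLD_LIVE), ('NEW_LIVE', NEW_LIVE),
                      ('OLD_VOD', OLD_VOD), ('NEW_VOD', NEW_VOD)]:
    print(name, match_id(pattern, vod_url))
```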

View File

@ -7,11 +7,12 @@
from ..utils import (
ExtractorError,
int_or_none,
join_nonempty,
parse_qs,
traverse_obj,
update_url_query,
urlencode_postdata,
)
from ..utils.traversal import traverse_obj, unpack
class PlaySuisseIE(InfoExtractor):
@ -26,12 +27,12 @@ class PlaySuisseIE(InfoExtractor):
{
# episode in a series
'url': 'https://www.playsuisse.ch/watch/763182?episodeId=763211',
'md5': '82df2a470b2dfa60c2d33772a8a60cf8',
'md5': 'e20d1ede6872a03b41905ca1060a1ef2',
'info_dict': {
'id': '763211',
'ext': 'mp4',
'title': 'Knochen',
'description': 'md5:8ea7a8076ba000cd9e8bc132fd0afdd8',
'description': 'md5:3bdd80e2ce20227c47aab1df2a79a519',
'duration': 3344,
'series': 'Wilder',
'season': 'Season 1',
@ -42,24 +43,33 @@ class PlaySuisseIE(InfoExtractor):
},
}, {
# film
'url': 'https://www.playsuisse.ch/watch/808675',
'md5': '818b94c1d2d7c4beef953f12cb8f3e75',
'url': 'https://www.playsuisse.ch/detail/2573198',
'md5': '1f115bb0a5191477b1a5771643a4283d',
'info_dict': {
'id': '808675',
'id': '2573198',
'ext': 'mp4',
'title': 'Der Läufer',
'description': 'md5:9f61265c7e6dcc3e046137a792b275fd',
'duration': 5280,
'title': 'Azor',
'description': 'md5:d41d8cd98f00b204e9800998ecf8427e',
'genres': ['Fiction'],
'creators': ['Andreas Fontana'],
'cast': ['Fabrizio Rongione', 'Stéphanie Cléau', 'Gilles Privat', 'Alexandre Trocki'],
'location': 'France; Argentine',
'release_year': 2021,
'duration': 5981,
'thumbnail': 're:https://playsuisse-img.akamaized.net/',
},
}, {
# series (treated as a playlist)
'url': 'https://www.playsuisse.ch/detail/1115687',
'info_dict': {
'description': 'md5:e4a2ae29a8895823045b5c3145a02aa3',
'id': '1115687',
'series': 'They all came out to Montreux',
'title': 'They all came out to Montreux',
'description': 'md5:0fefd8c5b4468a0bb35e916887681520',
'genres': ['Documentary'],
'creators': ['Oliver Murray'],
'location': 'Switzerland',
'release_year': 2021,
},
'playlist': [{
'info_dict': {
@ -120,6 +130,12 @@ class PlaySuisseIE(InfoExtractor):
id
name
description
descriptionLong
year
contentTypes
directors
mainCast
productionCountries
duration
episodeNumber
seasonNumber
@ -215,9 +231,7 @@ def _perform_login(self, username, password):
if not self._ID_TOKEN:
raise ExtractorError('Login failed')
def _get_media_data(self, media_id):
# NOTE In the web app, the "locale" header is used to switch between languages,
# However this doesn't seem to take effect when passing the header here.
def _get_media_data(self, media_id, locale=None):
response = self._download_json(
'https://www.playsuisse.ch/api/graphql',
media_id, data=json.dumps({
@ -225,7 +239,7 @@ def _get_media_data(self, media_id):
'query': self._GRAPHQL_QUERY,
'variables': {'assetId': media_id},
}).encode(),
headers={'Content-Type': 'application/json', 'locale': 'de'})
headers={'Content-Type': 'application/json', 'locale': locale or 'de'})
return response['data']['assetV2']
@ -234,7 +248,7 @@ def _real_extract(self, url):
self.raise_login_required(method='password')
media_id = self._match_id(url)
media_data = self._get_media_data(media_id)
media_data = self._get_media_data(media_id, traverse_obj(parse_qs(url), ('locale', 0)))
info = self._extract_single(media_data)
if media_data.get('episodes'):
info.update({
@ -257,15 +271,22 @@ def _extract_single(self, media_data):
self._merge_subtitles(subs, target=subtitles)
return {
'id': media_data['id'],
'title': media_data.get('name'),
'description': media_data.get('description'),
'thumbnails': thumbnails,
'duration': int_or_none(media_data.get('duration')),
'formats': formats,
'subtitles': subtitles,
'series': media_data.get('seriesName'),
'season_number': int_or_none(media_data.get('seasonNumber')),
'episode': media_data.get('name') if media_data.get('episodeNumber') else None,
'episode_number': int_or_none(media_data.get('episodeNumber')),
**traverse_obj(media_data, {
'id': ('id', {str}),
'title': ('name', {str}),
'description': (('descriptionLong', 'description'), {str}, any),
'genres': ('contentTypes', ..., {str}),
'creators': ('directors', ..., {str}),
'cast': ('mainCast', ..., {str}),
'location': ('productionCountries', ..., {str}, all, {unpack(join_nonempty, delim='; ')}, filter),
'release_year': ('year', {str}, {lambda x: x[:4]}, {int_or_none}),
'duration': ('duration', {int_or_none}),
'series': ('seriesName', {str}),
'season_number': ('seasonNumber', {int_or_none}),
'episode': ('name', {str}, {lambda x: x if media_data['episodeNumber'] is not None else None}),
'episode_number': ('episodeNumber', {int_or_none}),
}),
}
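
Three of the new PlaySuisse fields are derived rather than copied: the locale comes from the URL query, release_year from the first four characters of the year string, and location from the joined country list. The sketch below reproduces those derivations with the standard library only, as an approximation of the traverse_obj expressions in the diff. All three function names are hypothetical, and the locale=fr URL is an invented example, not one of the test URLs.

```python
from urllib.parse import parse_qs, urlparse


def locale_from_url(url, default='de'):
    """First 'locale' query value, falling back to 'de' as in _get_media_data."""
    values = parse_qs(urlparse(url).query).get('locale')
    return values[0] if values else default


def release_year(year):
    """Integer from the first four characters of a year string, else None."""
    if not isinstance(year, str):
        return None
    try:
        return int(year[:4])
    except ValueError:
        return None


def location(countries):
    """Join non-empty country names with '; '; None when nothing is left."""
    names = [c for c in countries or [] if isinstance(c, str) and c]
    return '; '.join(names) or None


print(locale_from_url('https://www.playsuisse.ch/watch/763182?episodeId=763211&locale=fr'))
print(release_year('2021'), location(['France', 'Argentine']))
```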

View File

@ -5,11 +5,13 @@
from ..utils import (
OnDemandPagedList,
float_or_none,
int_or_none,
orderedSet,
str_or_none,
str_to_int,
traverse_obj,
unified_timestamp,
url_or_none,
)
from ..utils.traversal import require, traverse_obj
class PodchaserIE(InfoExtractor):
@ -21,24 +23,25 @@ class PodchaserIE(InfoExtractor):
'id': '104365585',
'title': 'Ep. 285 freeze me off',
'description': 'cam ahn',
'thumbnail': r're:^https?://.*\.jpg$',
'thumbnail': r're:https?://.+/.+\.jpg',
'ext': 'mp3',
'categories': ['Comedy'],
'categories': ['Comedy', 'News', 'Politics', 'Arts'],
'tags': ['comedy', 'dark humor'],
'series': 'Cum Town',
'series': 'The Adam Friedland Show Podcast',
'duration': 3708,
'timestamp': 1636531259,
'upload_date': '20211110',
'average_rating': 4.0,
'series_id': '36924',
},
}, {
'url': 'https://www.podchaser.com/podcasts/the-bone-zone-28853',
'info_dict': {
'id': '28853',
'title': 'The Bone Zone',
'description': 'Podcast by The Bone Zone',
'description': r're:The official home of the Bone Zone podcast.+',
},
'playlist_count': 275,
'playlist_mincount': 275,
}, {
'url': 'https://www.podchaser.com/podcasts/sean-carrolls-mindscape-scienc-699349/episodes',
'info_dict': {
@ -51,19 +54,33 @@ class PodchaserIE(InfoExtractor):
@staticmethod
def _parse_episode(episode, podcast):
return {
'id': str(episode.get('id')),
'title': episode.get('title'),
'description': episode.get('description'),
'url': episode.get('audio_url'),
'thumbnail': episode.get('image_url'),
'duration': str_to_int(episode.get('length')),
'timestamp': unified_timestamp(episode.get('air_date')),
'average_rating': float_or_none(episode.get('rating')),
'categories': list(set(traverse_obj(podcast, (('summary', None), 'categories', ..., 'text')))),
'tags': traverse_obj(podcast, ('tags', ..., 'text')),
'series': podcast.get('title'),
}
info = traverse_obj(episode, {
'id': ('id', {int}, {str_or_none}, {require('episode ID')}),
'title': ('title', {str}),
'description': ('description', {str}),
'url': ('audio_url', {url_or_none}),
'thumbnail': ('image_url', {url_or_none}),
'duration': ('length', {int_or_none}),
'timestamp': ('air_date', {unified_timestamp}),
'average_rating': ('rating', {float_or_none}),
})
info.update(traverse_obj(podcast, {
'series': ('title', {str}),
'series_id': ('id', {int}, {str_or_none}),
'categories': (('summary', None), 'categories', ..., 'text', {str}, filter, all, {orderedSet}),
'tags': ('tags', ..., 'text', {str}),
}))
info['vcodec'] = 'none'
if info.get('series_id'):
podcast_slug = traverse_obj(podcast, ('slug', {str})) or 'podcast'
episode_slug = traverse_obj(episode, ('slug', {str})) or 'episode'
info['webpage_url'] = '/'.join((
'https://www.podchaser.com/podcasts',
'-'.join((podcast_slug[:30].rstrip('-'), info['series_id'])),
'-'.join((episode_slug[:30].rstrip('-'), info['id']))))
return info
def _call_api(self, path, *args, **kwargs):
return self._download_json(f'https://api.podchaser.com/{path}', *args, **kwargs)
@ -93,5 +110,5 @@ def _real_extract(self, url):
OnDemandPagedList(functools.partial(self._fetch_page, podcast_id, podcast), self._PAGE_SIZE),
str_or_none(podcast.get('id')), podcast.get('title'), podcast.get('description'))
episode = self._call_api(f'episodes/{episode_id}', episode_id)
episode = self._call_api(f'podcasts/{podcast_id}/episodes/{episode_id}/player_ids', episode_id)
return self._parse_episode(episode, podcast)
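
The webpage_url assembled in _parse_episode truncates each slug to 30 characters, strips any trailing hyphen, and appends the numeric ID. A standalone sketch of that construction follows. The function name build_webpage_url and the sample slugs are hypothetical; the joining logic mirrors the diff.

```python
def build_webpage_url(podcast_slug, series_id, episode_slug, episode_id):
    """Rebuild an episode page URL from slugs and string IDs, as in the diff."""
    # Fallback slugs match the defaults used in the diff
    podcast_slug = podcast_slug or 'podcast'
    episode_slug = episode_slug or 'episode'
    return '/'.join((
        'https://www.podchaser.com/podcasts',
        '-'.join((podcast_slug[:30].rstrip('-'), series_id)),
        '-'.join((episode_slug[:30].rstrip('-'), episode_id))))


# Hypothetical slugs; the IDs are passed as strings, as info['series_id'] and info['id'] are
print(build_webpage_url('the-bone-zone', '28853', 'some-episode', '1'))
```

The 30-character cut appears consistent with the third test URL in the diff, whose podcast segment ends in "scienc" before the ID.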

View File

@ -22,7 +22,7 @@
)
class PolskieRadioBaseExtractor(InfoExtractor):
class PolskieRadioBaseIE(InfoExtractor):
def _extract_webpage_player_entries(self, webpage, playlist_id, base_data):
media_urls = set()
@ -47,7 +47,7 @@ def _extract_webpage_player_entries(self, webpage, playlist_id, base_data):
yield entry
class PolskieRadioLegacyIE(PolskieRadioBaseExtractor):
class PolskieRadioLegacyIE(PolskieRadioBaseIE):
# legacy sites
IE_NAME = 'polskieradio:legacy'
_VALID_URL = r'https?://(?:www\.)?polskieradio(?:24)?\.pl/\d+/\d+/[Aa]rtykul/(?P<id>\d+)'
@ -127,7 +127,7 @@ def _real_extract(self, url):
return self.playlist_result(entries, playlist_id, title, description)
class PolskieRadioIE(PolskieRadioBaseExtractor):
class PolskieRadioIE(PolskieRadioBaseIE):
# new next.js sites
_VALID_URL = r'https?://(?:[^/]+\.)?(?:polskieradio(?:24)?|radiokierowcow)\.pl/artykul/(?P<id>\d+)'
_TESTS = [{
@ -519,7 +519,7 @@ def _real_extract(self, url):
}
class PolskieRadioPodcastBaseExtractor(InfoExtractor):
class PolskieRadioPodcastBaseIE(InfoExtractor):
_API_BASE = 'https://apipodcasts.polskieradio.pl/api'
def _parse_episode(self, data):
@ -539,7 +539,7 @@ def _parse_episode(self, data):
}
class PolskieRadioPodcastListIE(PolskieRadioPodcastBaseExtractor):
class PolskieRadioPodcastListIE(PolskieRadioPodcastBaseIE):
IE_NAME = 'polskieradio:podcast:list'
_VALID_URL = r'https?://podcasty\.polskieradio\.pl/podcast/(?P<id>\d+)'
_TESTS = [{
@ -578,7 +578,7 @@ def get_page(page_num):
}
class PolskieRadioPodcastIE(PolskieRadioPodcastBaseExtractor):
class PolskieRadioPodcastIE(PolskieRadioPodcastBaseIE):
IE_NAME = 'polskieradio:podcast'
_VALID_URL = r'https?://podcasty\.polskieradio\.pl/track/(?P<id>[a-f\d]{8}(?:-[a-f\d]{4}){4}[a-f\d]{8})'
_TESTS = [{

View File

@ -321,6 +321,27 @@ class RaiPlayIE(RaiBaseIE):
'timestamp': 1348495020,
'upload_date': '20120924',
},
}, {
# checking program_info gives false positive for DRM
'url': 'https://www.raiplay.it/video/2022/10/Ad-ogni-costo---Un-giorno-in-Pretura---Puntata-del-15102022-1dfd1295-ea38-4bac-b51e-f87e2881693b.html',
'md5': '572c6f711b7c5f2d670ba419b4ae3b08',
'info_dict': {
'id': '1dfd1295-ea38-4bac-b51e-f87e2881693b',
'ext': 'mp4',
'title': 'Ad ogni costo - Un giorno in Pretura - Puntata del 15/10/2022',
'alt_title': 'St 2022/23 - Un giorno in pretura - Ad ogni costo',
'description': 'md5:4046d97b2687f74f06a8b8270ba5599f',
'uploader': 'Rai 3',
'duration': 3773.0,
'thumbnail': 'https://www.raiplay.it/dl/img/2022/10/12/1665586539957_2048x2048.png',
'creators': ['Rai 3'],
'series': 'Un giorno in pretura',
'season': '2022/23',
'episode': 'Ad ogni costo',
'timestamp': 1665507240,
'upload_date': '20221011',
'release_year': 2025,
},
}, {
'url': 'http://www.raiplay.it/video/2016/11/gazebotraindesi-efebe701-969c-4593-92f3-285f0d1ce750.html?',
'only_matching': True,
@ -340,8 +361,7 @@ def _real_extract(self, url):
media = self._download_json(
f'{base}.json', video_id, 'Downloading video JSON')
if not self.get_param('allow_unplayable_formats'):
if traverse_obj(media, (('program_info', None), 'rights_management', 'rights', 'drm')):
if traverse_obj(media, ('rights_management', 'rights', 'drm')):
self.report_drm(video_id)
video = media['video']
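
The DRM change above narrows the check from "program_info or the media object itself" to the media object only. The following plain-dict sketch approximates both checks without traverse_obj, to illustrate the false positive mentioned in the new test comment. The helper names and the shape of the sample 'drm' value are assumptions, not taken from the RaiPlay API.

```python
def _rights_drm(node):
    """Return node['rights_management']['rights']['drm'], or None if any key is missing."""
    for key in ('rights_management', 'rights', 'drm'):
        if not isinstance(node, dict):
            return None
        node = node.get(key)
    return node


def has_drm_old(media):
    # Old behaviour (approximated): program-level OR video-level DRM info
    return bool(_rights_drm(media.get('program_info')) or _rights_drm(media))


def has_drm_new(media):
    # New behaviour (approximated): only the video-level DRM info counts
    return bool(_rights_drm(media))


# Hypothetical payload: the program is flagged, the individual video is not
media = {
    'program_info': {'rights_management': {'rights': {'drm': {'VOD': True}}}},
    'rights_management': {'rights': {}},
    'video': {},
}
print(has_drm_old(media), has_drm_new(media))
```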

Some files were not shown because too many files have changed in this diff