mirror of https://github.com/yt-dlp/yt-dlp.git synced 2026-01-18 12:51:27 +00:00

Compare commits


72 Commits

Author SHA1 Message Date
github-actions[bot]
9f40cd2896 Release 2023.12.30
Created by: bashonly

:ci skip all :ci run dl
2023-12-30 21:43:13 +00:00
bashonly
f10589e345 [docs] Update youtube-dl merge commit in README.md
Authored by: bashonly
2023-12-30 15:39:06 -06:00
Simon Sawicki
f9fb3ce86e [cleanup] Misc (#8598)
Authored by: bashonly, pukkandan, seproDev, Grub4K

Co-authored-by: bashonly <bashonly@protonmail.com>
Co-authored-by: pukkandan <pukkandan.ytdlp@gmail.com>
Co-authored-by: sepro <4618135+seproDev@users.noreply.github.com>
2023-12-30 22:27:36 +01:00
sepro
5f009a094f [ie/ARD] Overhaul extractors (#8878)
Closes #8731, Closes #6784, Closes #2366, Closes #2975, Closes #8760
Authored by: seproDev
2023-12-30 21:44:32 +01:00
Simon Sawicki
225cf2b830 Fix 2d1d683a54
Authored by: Grub4K
2023-12-26 20:07:09 +01:00
Simon Sawicki
2d1d683a54 [devscripts] run_tests: Create Python script (#8720)
Authored by: Grub4K
2023-12-26 18:30:04 +01:00
Simon Sawicki
65de7d204c Update to ytdl-commit-be008e6 (#8836)
- [utils] Make restricted filenames ignore some Unicode categories (by dirkf)
- [ie/telewebion] Fix extraction (by Grub4K)
- [ie/imgur] Overhaul extractor (by bashonly, Grub4K)
- [ie/EpidemicSound] Add extractor (by Grub4K)

Authored by: bashonly, dirkf, Grub4K

Co-authored-by: bashonly <bashonly@protonmail.com>
2023-12-26 01:40:24 +01:00
kclauhk
c39358a54b [ie/Facebook] Fix Memories extraction (#8681)
- Support group /posts/ URLs
- Raise a proper error message if no formats are found

Closes #8669
Authored by: kclauhk
2023-12-24 23:43:35 +01:00
Lars Strojny
1f8bd8eba8 [ie/ARDBetaMediathek] Fix series extraction (#8687)
Closes #7666
Authored by: lstrojny
2023-12-24 23:38:21 +01:00
Simon Sawicki
00cdda4f6f [core] Fix format selection parse error for CPython 3.12 (#8797)
Authored by: Grub4K
2023-12-24 22:09:01 +01:00
bashonly
116c268438 [ie/twitter] Work around API rate-limit (#8825)
Closes #8762
Authored by: bashonly
2023-12-24 16:41:28 +00:00
bashonly
e7d22348e7 [ie/twitter] Prioritize m3u8 formats (#8826)
Closes #8117
Authored by: bashonly
2023-12-24 16:40:50 +00:00
bashonly
50eaea9fd7 [ie/instagram] Fix stories extraction (#8843)
Closes #8290
Authored by: bashonly
2023-12-24 16:40:03 +00:00
bashonly
f45c4efcd9 [ie/litv] Fix premium content extraction (#8842)
Closes #8654
Authored by: bashonly
2023-12-24 16:33:16 +00:00
Simon Sawicki
13b3cb3c2b [ci] Run core tests only for core changes (#8841)
Authored by: Grub4K
2023-12-24 00:11:10 +01:00
Nicolas Dato
0d531c35ec [ie/RudoVideo] Add extractor (#8664)
Authored by: nicodato
2023-12-22 22:52:07 +01:00
barsnick
bc4ab17b38 [cleanup] Fix spelling of IE_NAME (#8810)
Authored by: barsnick
2023-12-22 02:32:29 +01:00
bashonly
632b8ee54e [core] Release workflow and Updater cleanup (#8640)
- Only use trusted publishing with PyPI and remove support for PyPI tokens from release workflow
- Clean up improper actions syntax in the build workflow inputs
- Refactor Updater to allow for consistent unit testing with `UPDATE_SOURCES`

Authored by: bashonly
2023-12-21 21:06:26 +00:00
barsnick
c919b68f7e [ie/bbc] Extract more formats (#8321)
Closes #4902
Authored by: barsnick, dirkf
2023-12-21 20:47:32 +00:00
bashonly
19741ab8a4 [ie/bbc] Fix JSON parsing bug
Authored by: bashonly
2023-12-21 14:46:00 -06:00
bashonly
37755a037e [test:networking] Update tests for OpenSSL 3.2 (#8814)
Authored by: bashonly
2023-12-20 19:03:54 +00:00
coletdjnz
196eb0fe77 [networking] Strip whitespace around header values (#8802)
Fixes https://github.com/yt-dlp/yt-dlp/issues/8729
Authored by: coletdjnz
2023-12-20 19:15:38 +13:00
Mozi
db8b4edc7d [ie/JoqrAg] Add extractor (#8384)
Authored by: pzhlkj6612
2023-12-19 14:21:47 +00:00
bashonly
1c54a98e19 [ie/twitter] Extract stale tweets (#8724)
Closes #8691
Authored by: bashonly
2023-12-19 13:24:55 +00:00
Simon Sawicki
00a3e47bf5 [ie/bundestag] Add extractor (#8783)
Authored by: Grub4K
2023-12-18 21:32:08 +01:00
Amir Y. Perehodnik
c5f01bf7d4 [ie/Maariv] Add extractor (#8331)
Authored by: amir16yp
2023-12-18 16:52:43 +01:00
Tristan Charpentier
c91af948e4 [ie/RinseFM] Add extractor (#8778)
Authored by: hashFactory
2023-12-17 14:07:55 +00:00
Pandey Ganesha
6b5d93b0b0 [ie/youtube] Fix like_count extraction (#8763)
Closes #8759
Authored by: Ganesh910
2023-12-13 07:04:12 +00:00
pukkandan
298230e550 [webvtt] Fix 15f22b4880
2023-12-13 05:11:45 +05:30
Mozi
d5d1517e7d [ie/eplus] Add login support and DRM detection (#8661)
Authored by: pzhlkj6612
2023-12-12 00:29:36 +00:00
trainman261
7e09c147fd [ie/theplatform] Extract more metadata (#8635)
Authored by: trainman261
2023-12-12 00:00:35 +00:00
Benjamin Krausse
e370f9ec36 [ie] Add media_type field
Authored by: trainman261
2023-12-11 17:57:41 -06:00
SirElderling
b1a1ec1540 [ie/bitchute] Fix and improve metadata extraction (#8507)
Closes #8492
Authored by: SirElderling
2023-12-11 23:56:01 +00:00
Simon Sawicki
0b6f829b1d [utils] traverse_obj: Move is_user_input into output template (#8673)
Authored by: Grub4K
2023-12-06 21:46:45 +01:00
Simon Sawicki
f98a3305eb [ie/pr0gramm] Support variant formats and subtitles (#8674)
Authored by: Grub4K
2023-12-06 21:44:54 +01:00
sepro
04a5e06350 [ie/ondemandkorea] Fix upgraded format extraction (#8677)
Closes #8675
Authored by: seproDev
2023-12-06 18:58:00 +01:00
Nicolas Cisco
b03c89309e [ie/mediastream] Fix authenticated format extraction (#8657)
Authored by: NickCis
2023-12-06 18:55:38 +01:00
Pierrick Guillaume
71f28097fe [ie/francetv] Improve metadata extraction (#8409)
Authored by: Fymyte
2023-12-06 16:10:11 +01:00
pukkandan
044886c220 [ie/youtube] Return empty playlist when channel/tab has no videos
Closes #8634
2023-12-06 03:44:13 +05:30
pukkandan
993edd3f6e [outtmpl] Support multiplication
Related: #8683
2023-12-06 03:44:11 +05:30
OIRNOIR
6a9c7a2b52 [ie/youtube] Support cf.piped.video (#8514)
Authored by: OIRNOIR
Closes #8457
2023-11-29 18:18:58 +05:30
pukkandan
a174c453ee Let read_stdin obey --quiet
Closes #8668
2023-11-29 05:48:40 +05:30
TSRBerry
15f22b4880 [webvtt] Allow spaces before newlines for CueBlock (#7681)
Closes #7453

Ref: https://www.w3.org/TR/webvtt1/#webvtt-cue-block
2023-11-29 04:50:06 +05:30
sepro
9751a457cf [cleanup] Remove dead extractors (#8604)
Closes #1609, Closes #3232, Closes #4763, Closes #6026, Closes #6322, Closes #7912
Authored by: seproDev
2023-11-26 03:09:59 +00:00
bashonly
5a230233d6 [ie/box] Fix formats extraction (#8649)
Closes #5098
Authored by: bashonly
2023-11-26 02:50:23 +00:00
bashonly
4903f452b6 [ie/bfmtv] Fix extractors (#8651)
Closes #8425
Authored by: bashonly
2023-11-26 02:49:18 +00:00
bashonly
ff2fde1b8f [ie/TwitCastingUser] Fix extraction (#8650)
Closes #8653
Authored by: bashonly
2023-11-26 02:47:48 +00:00
bashonly
deeb13eae8 [pp/FFmpegMetadata] Embed stream metadata in single format downloads (#8647)
Closes #8568
Authored by: bashonly
2023-11-26 02:40:09 +00:00
bashonly
bb5a54e6db [ie/youtube] Improve detection of faulty HLS formats (#8646)
Closes #7747
Authored by: bashonly
2023-11-26 02:21:29 +00:00
sepro
628fa244bb [ie/floatplane] Add extractors (#8639)
Closes #5877, Closes #5912
Authored by: seproDev
2023-11-26 02:20:10 +00:00
kclauhk
9cafb9ff17 [ie/facebook] Improve subtitles extraction (#8296)
Authored by: kclauhk
2023-11-26 02:17:16 +00:00
sepro
1732eccc0a [core] Parse release_year from release_date (#8524)
Closes #7263
Authored by: seproDev
2023-11-26 02:12:05 +00:00
pk
a0b19d319a [core] Support NO_COLOR environment variable (#8385)
Authored by: prettykool, Grub4K
2023-11-20 23:43:52 +01:00
middlingphys
cc07f5cc85 [ie/abematv] Fix season metadata (#8607)
Authored by: middlingphys
2023-11-20 22:39:12 +00:00
coletdjnz
ccfd70f4c2 [rh:websockets] Migrate websockets to networking framework (#7720)
* Adds a basic WebSocket framework
* Introduces new minimum `websockets` version of 12.0
* Deprecates `WebSocketsWrapper`

Fixes https://github.com/yt-dlp/yt-dlp/issues/8439

Authored by: coletdjnz
2023-11-20 08:04:04 +00:00
sepro
45d82be65f [ie/nebula] Overhaul extractors (#8566)
Closes #4300, Closes #5814, Closes #7588, Closes #6334, Closes #6538
Authored by: elyse0, pukkandan, seproDev

Co-authored-by: Elyse <26639800+elyse0@users.noreply.github.com>
Co-authored-by: pukkandan <pukkandan.ytdlp@gmail.com>
2023-11-20 01:03:33 +00:00
Safouane Aarab
3237f8ba29 [ie/allstar] Add extractors (#8274)
Closes #6917
Authored by: S-Aarab
2023-11-20 00:07:19 +00:00
Kyraminol Endyeran
1725e943b0 [ie/vvvvid] Set user-agent to fix extraction (#8615)
Authored by: Kyraminol
2023-11-19 21:30:21 +00:00
c-basalt
9f09bdcfcb [ie/bilibili] Support courses and interactive videos (#8343)
Closes #6135, Closes #8428
Authored by: c-basalt
2023-11-19 21:26:46 +00:00
Simon Sawicki
f124fa4588 [ci] Concurrency optimizations (#8614)
Authored by: Grub4K
2023-11-19 16:05:13 +01:00
JC-Chung
585d0ed9ab [ie/twitcasting] Detect livestreams via API and show page (#8601)
Authored by: JC-Chung, bashonly
2023-11-18 22:14:45 +00:00
SirElderling
1fa3f24d4b [ie/theguardian] Add extractors (#8535)
Closes #8520
Authored by: SirElderling
2023-11-18 21:54:00 +00:00
sepro
ddb2d7588b [ie] Extract from media elements in SMIL manifests (#8504)
Authored by: seproDev
2023-11-18 21:51:18 +00:00
qbnu
f223b1b078 [ie/vocaroo] Do not use deprecated getheader (#8606)
Authored by: qbnu
2023-11-18 21:49:23 +00:00
Berkay
6fe82491ed [ie/twitter:broadcast] Extract concurrent_view_count (#8600)
Authored by: sonmezberkay
2023-11-18 21:46:22 +00:00
sepro
34df1c1f60 [ie/vidly] Add extractor (#8612)
Authored by: seproDev
2023-11-18 20:28:25 +00:00
Simon Sawicki
1d24da6c89 [ie/nintendo] Fix Nintendo Direct extraction (#8609)
Authored by: Grub4K
2023-11-18 21:04:42 +01:00
Elan Ruusamäe
66a0127d45 [ie/duoplay] Add extractor (#8542)
Authored by: glensc
2023-11-16 22:46:29 +00:00
Raphaël Droz
3f90813f06 [ie/altcensored] Add extractor (#8291)
Authored by: drzraf
2023-11-16 22:24:12 +00:00
Ha Tien Loi
64de1a4c25 [ie/zingmp3] Add support for radio and podcasts (#7189)
Authored by: hatienl0i261299
2023-11-16 22:08:00 +00:00
sepro
f96ab86cd8 [ie/drtv] Set default ext for m3u8 formats (#8590)
Closes #8589
Authored by: seproDev
2023-11-16 20:46:13 +00:00
bashonly
f4b95acafc Remove Python 3.7 support (#8361)
Closes #7803
Authored by: bashonly
2023-11-16 18:39:00 +00:00
232 changed files with 5421 additions and 11804 deletions

View File

@@ -80,12 +80,12 @@ on:
         default: true
         type: boolean
       origin:
-        description: .
+        description: Origin
         required: false
-        default: ''
+        default: 'current repo'
         type: choice
         options:
-        - ''
+        - 'current repo'

 permissions:
   contents: read
@@ -99,7 +99,7 @@ jobs:
       - name: Process origin
        id: process_origin
        run: |
-          echo "origin=${{ inputs.origin || github.repository }}" >> "$GITHUB_OUTPUT"
+          echo "origin=${{ inputs.origin == 'current repo' && github.repository || inputs.origin }}" | tee "$GITHUB_OUTPUT"

   unix:
     needs: process
@@ -377,8 +377,8 @@ jobs:
     steps:
       - uses: actions/checkout@v4
       - uses: actions/setup-python@v4
-        with: # 3.7 is used for Vista support. See https://github.com/yt-dlp/yt-dlp/issues/390
-          python-version: "3.7"
+        with:
+          python-version: "3.8"
           architecture: "x86"
       - name: Install Requirements
         run: |
@@ -436,7 +436,16 @@ jobs:
         run: |
           cat >> _update_spec << EOF
           # This file is used for regulating self-update
-          lock 2022.08.18.36 .+ Python 3.6
+          lock 2022.08.18.36 .+ Python 3\.6
+          lock 2023.11.16 (?!win_x86_exe).+ Python 3\.7
+          lock 2023.11.16 win_x86_exe .+ Windows-(?:Vista|2008Server)
+          lockV2 yt-dlp/yt-dlp 2022.08.18.36 .+ Python 3\.6
+          lockV2 yt-dlp/yt-dlp 2023.11.16 (?!win_x86_exe).+ Python 3\.7
+          lockV2 yt-dlp/yt-dlp 2023.11.16 win_x86_exe .+ Windows-(?:Vista|2008Server)
+          lockV2 yt-dlp/yt-dlp-nightly-builds 2023.11.15.232826 (?!win_x86_exe).+ Python 3\.7
+          lockV2 yt-dlp/yt-dlp-nightly-builds 2023.11.15.232826 win_x86_exe .+ Windows-(?:Vista|2008Server)
+          lockV2 yt-dlp/yt-dlp-master-builds 2023.11.15.232812 (?!win_x86_exe).+ Python 3\.7
+          lockV2 yt-dlp/yt-dlp-master-builds 2023.11.15.232812 win_x86_exe .+ Windows-(?:Vista|2008Server)
           EOF
       - name: Sign checksum files
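The `_update_spec` hunk above is easier to read once you note the lock lines are regular expressions: `Python 3.6` is escaped to `Python 3\.6` so the `.` stops matching arbitrary characters, and the new `lock`/`lockV2` entries pin Python 3.7 and Windows Vista/2008Server builds to 2023.11.16, their last supported release. Below is a minimal sketch of how such lock lines could be applied, assuming prefix matching with `re.match`; the function name, lock table, and client-string format are illustrative, not yt-dlp's actual updater code:

```python
import re

# Hypothetical lock table mirroring the _update_spec lines added above:
# clients whose variant/platform string matches a pattern stay on 2023.11.16.
LOCKS = [
    ('2023.11.16', r'(?!win_x86_exe).+ Python 3\.7'),
    ('2023.11.16', r'win_x86_exe .+ Windows-(?:Vista|2008Server)'),
]

def max_allowed_version(client, latest):
    """Return the newest release this client may update to (illustrative)."""
    for version, pattern in LOCKS:
        if re.match(pattern, client):
            return version  # locked to the last compatible release
    return latest

print(max_allowed_version('zip Python 3.7.9', '2023.12.30'))       # -> 2023.11.16
print(max_allowed_version('win_exe Python 3.8.10', '2023.12.30'))  # -> 2023.12.30
```

Without the escaping fixed in this hunk, `Python 3.6` would also match strings like `Python 3x6`, since an unescaped `.` matches any character.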

View File

@@ -1,8 +1,32 @@
 name: Core Tests
-on: [push, pull_request]
+on:
+  push:
+    paths:
+      - .github/**
+      - devscripts/**
+      - test/**
+      - yt_dlp/**.py
+      - '!yt_dlp/extractor/*.py'
+      - yt_dlp/extractor/__init__.py
+      - yt_dlp/extractor/common.py
+      - yt_dlp/extractor/extractors.py
+  pull_request:
+    paths:
+      - .github/**
+      - devscripts/**
+      - test/**
+      - yt_dlp/**.py
+      - '!yt_dlp/extractor/*.py'
+      - yt_dlp/extractor/__init__.py
+      - yt_dlp/extractor/common.py
+      - yt_dlp/extractor/extractors.py
 permissions:
   contents: read
+concurrency:
+  group: core-${{ github.event.pull_request.number || github.ref }}
+  cancel-in-progress: ${{ github.event_name == 'pull_request' }}
 jobs:
   tests:
     name: Core Tests
@@ -12,30 +36,26 @@ jobs:
       fail-fast: false
       matrix:
         os: [ubuntu-latest]
-        # CPython 3.11 is in quick-test
-        python-version: ['3.8', '3.9', '3.10', '3.12', pypy-3.7, pypy-3.8, pypy-3.10]
-        run-tests-ext: [sh]
+        # CPython 3.8 is in quick-test
+        python-version: ['3.9', '3.10', '3.11', '3.12', pypy-3.8, pypy-3.10]
         include:
           # atleast one of each CPython/PyPy tests must be in windows
           - os: windows-latest
-            python-version: '3.7'
-            run-tests-ext: bat
+            python-version: '3.8'
           - os: windows-latest
             python-version: '3.12'
-            run-tests-ext: bat
           - os: windows-latest
             python-version: pypy-3.9
-            run-tests-ext: bat
     steps:
       - uses: actions/checkout@v4
       - name: Set up Python ${{ matrix.python-version }}
         uses: actions/setup-python@v4
         with:
           python-version: ${{ matrix.python-version }}
-      - name: Install dependencies
+      - name: Install test requirements
         run: pip install pytest -r requirements.txt
       - name: Run tests
         continue-on-error: False
         run: |
           python3 -m yt_dlp -v || true  # Print debug head
-          ./devscripts/run_tests.${{ matrix.run-tests-ext }} core
+          python3 ./devscripts/run_tests.py core

View File

@@ -15,10 +15,10 @@ jobs:
         with:
           python-version: 3.9
       - name: Install test requirements
-        run: pip install pytest
+        run: pip install pytest -r requirements.txt
       - name: Run tests
         continue-on-error: true
-        run: ./devscripts/run_tests.sh download
+        run: python3 ./devscripts/run_tests.py download
   full:
     name: Full Download Tests
@@ -28,24 +28,21 @@ jobs:
       fail-fast: true
       matrix:
         os: [ubuntu-latest]
-        python-version: ['3.7', '3.10', '3.12', pypy-3.7, pypy-3.8, pypy-3.10]
-        run-tests-ext: [sh]
+        python-version: ['3.10', '3.11', '3.12', pypy-3.8, pypy-3.10]
         include:
           # atleast one of each CPython/PyPy tests must be in windows
           - os: windows-latest
             python-version: '3.8'
-            run-tests-ext: bat
           - os: windows-latest
             python-version: pypy-3.9
-            run-tests-ext: bat
     steps:
       - uses: actions/checkout@v4
       - name: Set up Python ${{ matrix.python-version }}
         uses: actions/setup-python@v4
         with:
           python-version: ${{ matrix.python-version }}
-      - name: Install pytest
-        run: pip install pytest
+      - name: Install test requirements
+        run: pip install pytest -r requirements.txt
       - name: Run tests
         continue-on-error: true
-        run: ./devscripts/run_tests.${{ matrix.run-tests-ext }} download
+        run: python3 ./devscripts/run_tests.py download

View File

@@ -10,16 +10,16 @@ jobs:
     runs-on: ubuntu-latest
     steps:
       - uses: actions/checkout@v4
-      - name: Set up Python 3.11
+      - name: Set up Python 3.8
         uses: actions/setup-python@v4
         with:
-          python-version: '3.11'
+          python-version: '3.8'
       - name: Install test requirements
-        run: pip install pytest pycryptodomex
+        run: pip install pytest -r requirements.txt
       - name: Run tests
         run: |
           python3 -m yt_dlp -v || true
-          ./devscripts/run_tests.sh core
+          python3 ./devscripts/run_tests.py core
   flake8:
     name: Linter
     if: "!contains(github.event.head_commit.message, 'ci skip all')"

View File

@@ -10,7 +10,6 @@ on:
       - "pyinst.py"
 concurrency:
   group: release-master
-  cancel-in-progress: true
 permissions:
   contents: read

View File

@@ -64,7 +64,6 @@ jobs:
       target_tag: ${{ steps.setup_variables.outputs.target_tag }}
       pypi_project: ${{ steps.setup_variables.outputs.pypi_project }}
       pypi_suffix: ${{ steps.setup_variables.outputs.pypi_suffix }}
-      pypi_token: ${{ steps.setup_variables.outputs.pypi_token }}
       head_sha: ${{ steps.get_target.outputs.head_sha }}
     steps:
@@ -153,7 +152,6 @@ jobs:
             ${{ !!secrets[format('{0}_archive_repo_token', env.target_repo)] }} || fallback_token
             pypi_project='${{ vars[format('{0}_pypi_project', env.target_repo)] }}'
             pypi_suffix='${{ vars[format('{0}_pypi_suffix', env.target_repo)] }}'
-            ${{ !secrets[format('{0}_pypi_token', env.target_repo)] }} || pypi_token='${{ env.target_repo }}_pypi_token'
           fi
         else
           target_tag="${source_tag:-${version}}"
@@ -163,7 +161,6 @@ jobs:
             ${{ !!secrets[format('{0}_archive_repo_token', env.source_repo)] }} || fallback_token
             pypi_project='${{ vars[format('{0}_pypi_project', env.source_repo)] }}'
             pypi_suffix='${{ vars[format('{0}_pypi_suffix', env.source_repo)] }}'
-            ${{ !secrets[format('{0}_pypi_token', env.source_repo)] }} || pypi_token='${{ env.source_repo }}_pypi_token'
           else
             target_repo='${{ github.repository }}'
           fi
@@ -172,13 +169,6 @@ jobs:
         if [[ "${target_repo}" == '${{ github.repository }}' ]] && ${{ !inputs.prerelease }}; then
           pypi_project='${{ vars.PYPI_PROJECT }}'
         fi
-        if [[ -z "${pypi_token}" && "${pypi_project}" ]]; then
-          if ${{ !secrets.PYPI_TOKEN }}; then
-            pypi_token=OIDC
-          else
-            pypi_token=PYPI_TOKEN
-          fi
-        fi
         echo "::group::Output variables"
         cat << EOF | tee -a "$GITHUB_OUTPUT"
@@ -189,7 +179,6 @@ jobs:
         target_tag=${target_tag}
         pypi_project=${pypi_project}
         pypi_suffix=${pypi_suffix}
-        pypi_token=${pypi_token}
         EOF
         echo "::endgroup::"
@@ -286,18 +275,7 @@ jobs:
           python devscripts/set-variant.py pip -M "You installed yt-dlp with pip or using the wheel from PyPi; Use that to update"
           python setup.py sdist bdist_wheel
-      - name: Publish to PyPI via token
-        env:
-          TWINE_USERNAME: __token__
-          TWINE_PASSWORD: ${{ secrets[needs.prepare.outputs.pypi_token] }}
-        if: |
-          needs.prepare.outputs.pypi_token != 'OIDC' && env.TWINE_PASSWORD
-        run: |
-          twine upload dist/*
-      - name: Publish to PyPI via trusted publishing
-        if: |
-          needs.prepare.outputs.pypi_token == 'OIDC'
+      - name: Publish to PyPI
         uses: pypa/gh-action-pypi-publish@release/v1
         with:
           verbose: true

View File

@@ -140,12 +140,9 @@ To run yt-dlp as a developer, you don't need to build anything either. Simply ex
     python -m yt_dlp
-To run the test, simply invoke your favorite test runner, or execute a test file directly; any of the following work:
-    python -m unittest discover
-    python test/test_download.py
-    nosetests
-    pytest
+To run all the available core tests, use:
+    python devscripts/run_tests.py
 See item 6 of [new extractor tutorial](#adding-support-for-a-new-site) for how to run extractor specific test cases.
@@ -187,15 +184,21 @@ After you have ensured this site is distributing its content legally, you can fo
         'url': 'https://yourextractor.com/watch/42',
         'md5': 'TODO: md5 sum of the first 10241 bytes of the video file (use --test)',
         'info_dict': {
+            # For videos, only the 'id' and 'ext' fields are required to RUN the test:
             'id': '42',
             'ext': 'mp4',
-            'title': 'Video title goes here',
-            'thumbnail': r're:^https?://.*\.jpg$',
-            # TODO more properties, either as:
-            # * A value
-            # * MD5 checksum; start the string with md5:
-            # * A regular expression; start the string with re:
-            # * Any Python type, e.g. int or float
+            # Then if the test run fails, it will output the missing/incorrect fields.
+            # Properties can be added as:
+            # * A value, e.g.
+            #     'title': 'Video title goes here',
+            # * MD5 checksum; start the string with 'md5:', e.g.
+            #     'description': 'md5:098f6bcd4621d373cade4e832627b4f6',
+            # * A regular expression; start the string with 're:', e.g.
+            #     'thumbnail': r're:^https?://.*\.jpg$',
+            # * A count of elements in a list; start the string with 'count:', e.g.
+            #     'tags': 'count:10',
+            # * Any Python type, e.g.
+            #     'view_count': int,
         }
     }]
@@ -215,14 +218,14 @@ After you have ensured this site is distributing its content legally, you can fo
     }
     ```
 1. Add an import in [`yt_dlp/extractor/_extractors.py`](yt_dlp/extractor/_extractors.py). Note that the class name must end with `IE`.
-1. Run `python test/test_download.py TestDownload.test_YourExtractor` (note that `YourExtractor` doesn't end with `IE`). This *should fail* at first, but you can continually re-run it until you're done. If you decide to add more than one test, the tests will then be named `TestDownload.test_YourExtractor`, `TestDownload.test_YourExtractor_1`, `TestDownload.test_YourExtractor_2`, etc. Note that tests with `only_matching` key in test's dict are not counted in. You can also run all the tests in one go with `TestDownload.test_YourExtractor_all`
+1. Run `python devscripts/run_tests.py YourExtractor`. This *may fail* at first, but you can continually re-run it until you're done. Upon failure, it will output the missing fields and/or correct values which you can copy. If you decide to add more than one test, the tests will then be named `YourExtractor`, `YourExtractor_1`, `YourExtractor_2`, etc. Note that tests with an `only_matching` key in the test's dict are not included in the count. You can also run all the tests in one go with `YourExtractor_all`
-1. Make sure you have atleast one test for your extractor. Even if all videos covered by the extractor are expected to be inaccessible for automated testing, tests should still be added with a `skip` parameter indicating why the particular test is disabled from running.
+1. Make sure you have at least one test for your extractor. Even if all videos covered by the extractor are expected to be inaccessible for automated testing, tests should still be added with a `skip` parameter indicating why the particular test is disabled from running.
 1. Have a look at [`yt_dlp/extractor/common.py`](yt_dlp/extractor/common.py) for possible helper methods and a [detailed description of what your extractor should and may return](yt_dlp/extractor/common.py#L119-L440). Add tests and code for as many as you want.
 1. Make sure your code follows [yt-dlp coding conventions](#yt-dlp-coding-conventions) and check the code with [flake8](https://flake8.pycqa.org/en/latest/index.html#quickstart):
        $ flake8 yt_dlp/extractor/yourextractor.py
-1. Make sure your code works under all [Python](https://www.python.org/) versions supported by yt-dlp, namely CPython and PyPy for Python 3.7 and above. Backward compatibility is not required for even older versions of Python.
+1. Make sure your code works under all [Python](https://www.python.org/) versions supported by yt-dlp, namely CPython and PyPy for Python 3.8 and above. Backward compatibility is not required for even older versions of Python.
 1. When the tests pass, [add](https://git-scm.com/docs/git-add) the new files, [commit](https://git-scm.com/docs/git-commit) them and [push](https://git-scm.com/docs/git-push) the result, like this:
        $ git add yt_dlp/extractor/_extractors.py
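The rewritten comment block in the `info_dict` hunk above describes four kinds of field assertions: plain values, `md5:`/`re:`/`count:` string prefixes, and bare Python types. A minimal sketch of how a test harness could interpret them, assuming dispatch on the string prefix; `check_field` is a hypothetical helper for illustration, not yt-dlp's actual test code:

```python
import hashlib
import re

def check_field(expected, actual):
    """Hypothetical version of the assertion rules described above."""
    if isinstance(expected, type):                 # e.g. 'view_count': int
        return isinstance(actual, expected)
    if isinstance(expected, str):
        if expected.startswith('md5:'):            # MD5 of the actual value
            return expected[4:] == hashlib.md5(actual.encode()).hexdigest()
        if expected.startswith('re:'):             # regular-expression match
            return re.match(expected[3:], actual) is not None
        if expected.startswith('count:'):          # length of a list field
            return len(actual) == int(expected[6:])
    return expected == actual                      # plain value comparison

assert check_field('md5:098f6bcd4621d373cade4e832627b4f6', 'test')
assert check_field(r're:^https?://.*\.jpg$', 'https://example.com/a.jpg')
assert check_field('count:2', ['music', 'live'])
assert check_field(int, 42)
```

The `md5:` form is useful for long descriptions that would bloat the test dict, while `re:` and `count:` keep tests stable against values that change between runs.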

View File

@@ -528,3 +528,17 @@ almx
 elivinsky
 starius
 TravisDupes
+amir16yp
+Fymyte
+Ganesh910
+hashFactory
+kclauhk
+Kyraminol
+lstrojny
+middlingphys
+NickCis
+nicodato
+prettykool
+S-Aarab
+sonmezberkay
+TSRBerry

View File

@@ -4,6 +4,93 @@
 # To create a release, dispatch the https://github.com/yt-dlp/yt-dlp/actions/workflows/release.yml workflow on master
 -->
+### 2023.12.30
+#### Core changes
+- [Fix format selection parse error for CPython 3.12](https://github.com/yt-dlp/yt-dlp/commit/00cdda4f6fe18712ced13dbc64b7ea10f323e268) ([#8797](https://github.com/yt-dlp/yt-dlp/issues/8797)) by [Grub4K](https://github.com/Grub4K)
+- [Let `read_stdin` obey `--quiet`](https://github.com/yt-dlp/yt-dlp/commit/a174c453ee1e853c584ceadeac17eef2bd433dc5) by [pukkandan](https://github.com/pukkandan)
+- [Merged with youtube-dl be008e6](https://github.com/yt-dlp/yt-dlp/commit/65de7d204ce88c0225df1321060304baab85dbd8) by [bashonly](https://github.com/bashonly), [dirkf](https://github.com/dirkf), [Grub4K](https://github.com/Grub4K)
+- [Parse `release_year` from `release_date`](https://github.com/yt-dlp/yt-dlp/commit/1732eccc0a40256e076bf0435a29f0f1d8419280) ([#8524](https://github.com/yt-dlp/yt-dlp/issues/8524)) by [seproDev](https://github.com/seproDev)
+- [Release workflow and Updater cleanup](https://github.com/yt-dlp/yt-dlp/commit/632b8ee54eb2df8ac6e20746a0bd95b7ebb053aa) ([#8640](https://github.com/yt-dlp/yt-dlp/issues/8640)) by [bashonly](https://github.com/bashonly)
+- [Remove Python 3.7 support](https://github.com/yt-dlp/yt-dlp/commit/f4b95acafcd69a50040730dfdf732e797278fdcc) ([#8361](https://github.com/yt-dlp/yt-dlp/issues/8361)) by [bashonly](https://github.com/bashonly)
+- [Support `NO_COLOR` environment variable](https://github.com/yt-dlp/yt-dlp/commit/a0b19d319a6ce8b7059318fa17a34b144fde1785) ([#8385](https://github.com/yt-dlp/yt-dlp/issues/8385)) by [Grub4K](https://github.com/Grub4K), [prettykool](https://github.com/prettykool)
+- **outtmpl**: [Support multiplication](https://github.com/yt-dlp/yt-dlp/commit/993edd3f6e17e966c763bc86dc34125445cec6b6) by [pukkandan](https://github.com/pukkandan)
+- **utils**: `traverse_obj`: [Move `is_user_input` into output template](https://github.com/yt-dlp/yt-dlp/commit/0b6f829b1dfda15d3c1d7d1fbe4ea6102c26dd24) ([#8673](https://github.com/yt-dlp/yt-dlp/issues/8673)) by [Grub4K](https://github.com/Grub4K)
+- **webvtt**: [Allow spaces before newlines for CueBlock](https://github.com/yt-dlp/yt-dlp/commit/15f22b4880b6b3f71f350c64d70976ae65b9f1ca) ([#7681](https://github.com/yt-dlp/yt-dlp/issues/7681)) by [TSRBerry](https://github.com/TSRBerry) (With fixes in [298230e](https://github.com/yt-dlp/yt-dlp/commit/298230e550886b746c266724dd701d842ca2696e) by [pukkandan](https://github.com/pukkandan))
+#### Extractor changes
+- [Add `media_type` field](https://github.com/yt-dlp/yt-dlp/commit/e370f9ec36972d06100a3db893b397bfc1b07b4d) by [trainman261](https://github.com/trainman261)
+- [Extract from `media` elements in SMIL manifests](https://github.com/yt-dlp/yt-dlp/commit/ddb2d7588bea48bae965dbfabe6df6550c9d3d43) ([#8504](https://github.com/yt-dlp/yt-dlp/issues/8504)) by [seproDev](https://github.com/seproDev)
+- **abematv**: [Fix season metadata](https://github.com/yt-dlp/yt-dlp/commit/cc07f5cc85d9e2a6cd0bedb9d961665eea0d6047) ([#8607](https://github.com/yt-dlp/yt-dlp/issues/8607)) by [middlingphys](https://github.com/middlingphys)
+- **allstar**: [Add extractors](https://github.com/yt-dlp/yt-dlp/commit/3237f8ba29fe13bf95ff42b1e48b5b5109715feb) ([#8274](https://github.com/yt-dlp/yt-dlp/issues/8274)) by [S-Aarab](https://github.com/S-Aarab)
+- **altcensored**: [Add extractor](https://github.com/yt-dlp/yt-dlp/commit/3f90813f0617e0d21302398010de7496c9ae36aa) ([#8291](https://github.com/yt-dlp/yt-dlp/issues/8291)) by [drzraf](https://github.com/drzraf)
+- **ard**: [Overhaul extractors](https://github.com/yt-dlp/yt-dlp/commit/5f009a094f0e8450792b097c4c8273622778052d) ([#8878](https://github.com/yt-dlp/yt-dlp/issues/8878)) by [seproDev](https://github.com/seproDev)
+- **ardbetamediathek**: [Fix series extraction](https://github.com/yt-dlp/yt-dlp/commit/1f8bd8eba82ba10ddb49ee7cc0be4540dab103d5) ([#8687](https://github.com/yt-dlp/yt-dlp/issues/8687)) by [lstrojny](https://github.com/lstrojny)
+- **bbc**
+    - [Extract more formats](https://github.com/yt-dlp/yt-dlp/commit/c919b68f7e79ea5010f75f648d3c9e45405a8011) ([#8321](https://github.com/yt-dlp/yt-dlp/issues/8321)) by [barsnick](https://github.com/barsnick), [dirkf](https://github.com/dirkf)
+    - [Fix JSON parsing bug](https://github.com/yt-dlp/yt-dlp/commit/19741ab8a401ec64d5e84fdbfcfb141d105e7bc8) by [bashonly](https://github.com/bashonly)
+- **bfmtv**: [Fix extractors](https://github.com/yt-dlp/yt-dlp/commit/4903f452b68efb62dadf22e81be8c7934fc743e7) ([#8651](https://github.com/yt-dlp/yt-dlp/issues/8651)) by [bashonly](https://github.com/bashonly)
+- **bilibili**: [Support courses and interactive videos](https://github.com/yt-dlp/yt-dlp/commit/9f09bdcfcb8e2b4b2decdc30d35d34b993bc7a94) ([#8343](https://github.com/yt-dlp/yt-dlp/issues/8343)) by [c-basalt](https://github.com/c-basalt)
+- **bitchute**: [Fix and improve metadata extraction](https://github.com/yt-dlp/yt-dlp/commit/b1a1ec1540605d2ea7abdb63336ffb1c56bf6316) ([#8507](https://github.com/yt-dlp/yt-dlp/issues/8507)) by [SirElderling](https://github.com/SirElderling)
+- **box**: [Fix formats extraction](https://github.com/yt-dlp/yt-dlp/commit/5a230233d6fce06f4abd1fce0dc92b948e6f780b) ([#8649](https://github.com/yt-dlp/yt-dlp/issues/8649)) by [bashonly](https://github.com/bashonly)
+- **bundestag**: [Add extractor](https://github.com/yt-dlp/yt-dlp/commit/00a3e47bf5440c96025a76e08337ff2a475ed83e) ([#8783](https://github.com/yt-dlp/yt-dlp/issues/8783)) by [Grub4K](https://github.com/Grub4K)
+- **drtv**: [Set default ext for m3u8 formats](https://github.com/yt-dlp/yt-dlp/commit/f96ab86cd837b1b5823baa87d144e15322ee9298) ([#8590](https://github.com/yt-dlp/yt-dlp/issues/8590)) by [seproDev](https://github.com/seproDev)
+- **duoplay**: [Add extractor](https://github.com/yt-dlp/yt-dlp/commit/66a0127d45033c698bdbedf162cddc55d9e7b906) ([#8542](https://github.com/yt-dlp/yt-dlp/issues/8542)) by [glensc](https://github.com/glensc)
+- **eplus**: [Add login support and DRM detection](https://github.com/yt-dlp/yt-dlp/commit/d5d1517e7d838500800d193ac3234b06e89654cd) ([#8661](https://github.com/yt-dlp/yt-dlp/issues/8661)) by [pzhlkj6612](https://github.com/pzhlkj6612)
+- **facebook**
+    - [Fix Memories extraction](https://github.com/yt-dlp/yt-dlp/commit/c39358a54bc6675ae0c50b81024e5a086e41656a) ([#8681](https://github.com/yt-dlp/yt-dlp/issues/8681)) by [kclauhk](https://github.com/kclauhk)
+    - [Improve subtitles extraction](https://github.com/yt-dlp/yt-dlp/commit/9cafb9ff17e14475a35c9a58b5bb010c86c9db4b) ([#8296](https://github.com/yt-dlp/yt-dlp/issues/8296)) by [kclauhk](https://github.com/kclauhk)
+- **floatplane**: [Add extractors](https://github.com/yt-dlp/yt-dlp/commit/628fa244bbce2ad39775a5959e99588f30cac152) ([#8639](https://github.com/yt-dlp/yt-dlp/issues/8639)) by [seproDev](https://github.com/seproDev)
+- **francetv**: [Improve metadata extraction](https://github.com/yt-dlp/yt-dlp/commit/71f28097fec1c9e029f74b68a4eadc8915399840) ([#8409](https://github.com/yt-dlp/yt-dlp/issues/8409)) by [Fymyte](https://github.com/Fymyte)
+- **instagram**: [Fix stories extraction](https://github.com/yt-dlp/yt-dlp/commit/50eaea9fd7787546b53660e736325fa31c77765d) ([#8843](https://github.com/yt-dlp/yt-dlp/issues/8843)) by [bashonly](https://github.com/bashonly)
+- **joqrag**: [Add extractor](https://github.com/yt-dlp/yt-dlp/commit/db8b4edc7d0bd27da462f6fe82ff6e13e3d68a04) ([#8384](https://github.com/yt-dlp/yt-dlp/issues/8384)) by [pzhlkj6612](https://github.com/pzhlkj6612)
+- **litv**: [Fix premium content extraction](https://github.com/yt-dlp/yt-dlp/commit/f45c4efcd928a173e1300a8f1ce4258e70c969b1) ([#8842](https://github.com/yt-dlp/yt-dlp/issues/8842)) by [bashonly](https://github.com/bashonly)
+- **maariv**: [Add extractor](https://github.com/yt-dlp/yt-dlp/commit/c5f01bf7d4b9426c87c3f8248de23934a56579e0) ([#8331](https://github.com/yt-dlp/yt-dlp/issues/8331)) by [amir16yp](https://github.com/amir16yp)
+- **mediastream**: [Fix authenticated format extraction](https://github.com/yt-dlp/yt-dlp/commit/b03c89309eb141be1a1eceeeb7475dd3b7529ad9) ([#8657](https://github.com/yt-dlp/yt-dlp/issues/8657)) by [NickCis](https://github.com/NickCis)
+- **nebula**: [Overhaul extractors](https://github.com/yt-dlp/yt-dlp/commit/45d82be65f71bb05506bd55376c6fdb36bc54142) ([#8566](https://github.com/yt-dlp/yt-dlp/issues/8566)) by [elyse0](https://github.com/elyse0), [pukkandan](https://github.com/pukkandan), [seproDev](https://github.com/seproDev)
+- **nintendo**: [Fix Nintendo Direct extraction](https://github.com/yt-dlp/yt-dlp/commit/1d24da6c899ef280d8b0a48a5e280ecd5d39cdf4) ([#8609](https://github.com/yt-dlp/yt-dlp/issues/8609)) by [Grub4K](https://github.com/Grub4K)
+- **ondemandkorea**: [Fix upgraded format extraction](https://github.com/yt-dlp/yt-dlp/commit/04a5e06350e3ef7c03f94f2f3f90dd96c6411152) ([#8677](https://github.com/yt-dlp/yt-dlp/issues/8677)) by [seproDev](https://github.com/seproDev)
+- **pr0gramm**: [Support variant formats and subtitles](https://github.com/yt-dlp/yt-dlp/commit/f98a3305eb124a0c375d03209d5c5a64fe1766c8) ([#8674](https://github.com/yt-dlp/yt-dlp/issues/8674)) by [Grub4K](https://github.com/Grub4K)
+- **rinsefm**: [Add extractor](https://github.com/yt-dlp/yt-dlp/commit/c91af948e43570025e4aa887e248fd025abae394) ([#8778](https://github.com/yt-dlp/yt-dlp/issues/8778)) by [hashFactory](https://github.com/hashFactory)
+- **rudovideo**: [Add extractor](https://github.com/yt-dlp/yt-dlp/commit/0d531c35eca4c2eb36e160530a7a333edbc727cc) ([#8664](https://github.com/yt-dlp/yt-dlp/issues/8664)) by [nicodato](https://github.com/nicodato)
+- **theguardian**: [Add extractors](https://github.com/yt-dlp/yt-dlp/commit/1fa3f24d4b5d22176b11d78420f1f4b64a5af0a8) ([#8535](https://github.com/yt-dlp/yt-dlp/issues/8535)) by [SirElderling](https://github.com/SirElderling)
+- **theplatform**: [Extract more metadata](https://github.com/yt-dlp/yt-dlp/commit/7e09c147fdccb44806bbf601573adc4b77210a89) ([#8635](https://github.com/yt-dlp/yt-dlp/issues/8635)) by [trainman261](https://github.com/trainman261)
+- **twitcasting**: [Detect livestreams via API and `show` page](https://github.com/yt-dlp/yt-dlp/commit/585d0ed9abcfcb957f2b2684b8ad43c3af160383) ([#8601](https://github.com/yt-dlp/yt-dlp/issues/8601)) by [bashonly](https://github.com/bashonly), [JC-Chung](https://github.com/JC-Chung)
+- **twitcastinguser**: [Fix extraction](https://github.com/yt-dlp/yt-dlp/commit/ff2fde1b8f922fd34bae6172602008cd67c07c93) ([#8650](https://github.com/yt-dlp/yt-dlp/issues/8650)) by [bashonly](https://github.com/bashonly)
+- **twitter**
+    - [Extract stale tweets](https://github.com/yt-dlp/yt-dlp/commit/1c54a98e19d047e7c15184237b6ef8ad50af489c) ([#8724](https://github.com/yt-dlp/yt-dlp/issues/8724)) by [bashonly](https://github.com/bashonly)
+    - [Prioritize m3u8 formats](https://github.com/yt-dlp/yt-dlp/commit/e7d22348e77367740da78a3db27167ecf894b7c9) ([#8826](https://github.com/yt-dlp/yt-dlp/issues/8826)) by [bashonly](https://github.com/bashonly)
+    - [Work around API rate-limit](https://github.com/yt-dlp/yt-dlp/commit/116c268438ea4d3738f6fa502c169081ca8f0ee7) ([#8825](https://github.com/yt-dlp/yt-dlp/issues/8825)) by [bashonly](https://github.com/bashonly)
+    - broadcast: [Extract `concurrent_view_count`](https://github.com/yt-dlp/yt-dlp/commit/6fe82491ed622b948c512cf4aab46ac3a234ae0a) ([#8600](https://github.com/yt-dlp/yt-dlp/issues/8600)) by [sonmezberkay](https://github.com/sonmezberkay)
+- **vidly**: [Add extractor](https://github.com/yt-dlp/yt-dlp/commit/34df1c1f60fa652c0a6a5c712b06c10e45daf6b7) ([#8612](https://github.com/yt-dlp/yt-dlp/issues/8612)) by [seproDev](https://github.com/seproDev)
+- **vocaroo**: [Do not use deprecated `getheader`](https://github.com/yt-dlp/yt-dlp/commit/f223b1b0789f65e06619dcc9fc9e74f50d259379) ([#8606](https://github.com/yt-dlp/yt-dlp/issues/8606)) by [qbnu](https://github.com/qbnu)
+- **vvvvid**: [Set user-agent to fix extraction](https://github.com/yt-dlp/yt-dlp/commit/1725e943b0e8a8b585305660d4611e684374409c) ([#8615](https://github.com/yt-dlp/yt-dlp/issues/8615)) by [Kyraminol](https://github.com/Kyraminol)
+- **youtube**
+    - [Fix `like_count` extraction](https://github.com/yt-dlp/yt-dlp/commit/6b5d93b0b0240e287389d1d43b2d5293e18aa4cc) ([#8763](https://github.com/yt-dlp/yt-dlp/issues/8763)) by [Ganesh910](https://github.com/Ganesh910)
+    - [Improve detection of faulty HLS formats](https://github.com/yt-dlp/yt-dlp/commit/bb5a54e6db2422bbd155d93a0e105b6616c09467) ([#8646](https://github.com/yt-dlp/yt-dlp/issues/8646)) by [bashonly](https://github.com/bashonly)
+    - [Return empty playlist when channel/tab has no videos](https://github.com/yt-dlp/yt-dlp/commit/044886c220620a7679109e92352890e18b6079e3) by [pukkandan](https://github.com/pukkandan)
+    - [Support cf.piped.video](https://github.com/yt-dlp/yt-dlp/commit/6a9c7a2b52655bacfa7ab2da24fd0d14a6fff495) ([#8514](https://github.com/yt-dlp/yt-dlp/issues/8514)) by [OIRNOIR](https://github.com/OIRNOIR)
+- **zingmp3**: [Add support for radio and podcasts](https://github.com/yt-dlp/yt-dlp/commit/64de1a4c25bada90374b88d7353754fe8fbfcc51) ([#7189](https://github.com/yt-dlp/yt-dlp/issues/7189)) by [hatienl0i261299](https://github.com/hatienl0i261299)
+#### Postprocessor changes
+- **ffmpegmetadata**: [Embed stream metadata in single format downloads](https://github.com/yt-dlp/yt-dlp/commit/deeb13eae82e60f82a2c0c5861f460399a997528) ([#8647](https://github.com/yt-dlp/yt-dlp/issues/8647)) by [bashonly](https://github.com/bashonly)
+#### Networking changes
+- [Strip whitespace around header values](https://github.com/yt-dlp/yt-dlp/commit/196eb0fe77b78e2e5ca02c506c3837c2b1a7964c) ([#8802](https://github.com/yt-dlp/yt-dlp/issues/8802)) by [coletdjnz](https://github.com/coletdjnz)
+- **Request Handler**: websockets: [Migrate websockets to networking framework](https://github.com/yt-dlp/yt-dlp/commit/ccfd70f4c24b579c72123ca76ab50164f8f122b7) ([#7720](https://github.com/yt-dlp/yt-dlp/issues/7720)) by [coletdjnz](https://github.com/coletdjnz)
+#### Misc. changes
+- **ci**
+    - [Concurrency optimizations](https://github.com/yt-dlp/yt-dlp/commit/f124fa458826308afc86cf364c509f857686ecfd) ([#8614](https://github.com/yt-dlp/yt-dlp/issues/8614)) by [Grub4K](https://github.com/Grub4K)
+    - [Run core tests only for core changes](https://github.com/yt-dlp/yt-dlp/commit/13b3cb3c2b7169a1e17d6fc62593bf744170521c) ([#8841](https://github.com/yt-dlp/yt-dlp/issues/8841)) by [Grub4K](https://github.com/Grub4K)
+- **cleanup**
+    - [Fix spelling of `IE_NAME`](https://github.com/yt-dlp/yt-dlp/commit/bc4ab17b38f01000d99c5c2bedec89721fee65ec) ([#8810](https://github.com/yt-dlp/yt-dlp/issues/8810)) by [barsnick](https://github.com/barsnick)
+    - [Remove dead extractors](https://github.com/yt-dlp/yt-dlp/commit/9751a457cfdb18bf99d9ee0d10e4e6a594502bbf) ([#8604](https://github.com/yt-dlp/yt-dlp/issues/8604)) by [seproDev](https://github.com/seproDev)
+    - Miscellaneous: [f9fb3ce](https://github.com/yt-dlp/yt-dlp/commit/f9fb3ce86e3c6a0c3c33b45392b8d7288bceba76) by [bashonly](https://github.com/bashonly), [Grub4K](https://github.com/Grub4K), [pukkandan](https://github.com/pukkandan), [seproDev](https://github.com/seproDev)
+- **devscripts**: `run_tests`: [Create Python script](https://github.com/yt-dlp/yt-dlp/commit/2d1d683a541d71f3d3bb999dfe8eeb1976fb91ce) ([#8720](https://github.com/yt-dlp/yt-dlp/issues/8720)) by [Grub4K](https://github.com/Grub4K) (With fixes in [225cf2b](https://github.com/yt-dlp/yt-dlp/commit/225cf2b830a1de2c5eacd257edd2a01aed1e1114))
+- **docs**: [Update youtube-dl merge commit in `README.md`](https://github.com/yt-dlp/yt-dlp/commit/f10589e3453009bb523f55849bba144c9b91cf2a) by [bashonly](https://github.com/bashonly)
+- **test**: networking: [Update tests for OpenSSL 3.2](https://github.com/yt-dlp/yt-dlp/commit/37755a037e612bfc608c3d4722e8ef2ce6a022ee) ([#8814](https://github.com/yt-dlp/yt-dlp/issues/8814)) by [bashonly](https://github.com/bashonly)
 ### 2023.11.16
 #### Extractor changes

View File

@@ -29,6 +29,7 @@ You can also find lists of all [contributors of yt-dlp](CONTRIBUTORS) and [autho
 [![gh-sponsor](https://img.shields.io/badge/_-Github-white.svg?logo=github&labelColor=555555&style=for-the-badge)](https://github.com/sponsors/coletdjnz)
 * Improved plugin architecture
+* Rewrote the networking infrastructure, implemented support for `requests`
 * YouTube improvements including: age-gate bypass, private playlists, multiple-clients (to avoid throttling) and a lot of under-the-hood improvements
 * Added support for new websites YoutubeWebArchive, MainStreaming, PRX, nzherald, Mediaklikk, StarTV etc
 * Improved/fixed support for Patreon, panopto, gfycat, itv, pbs, SouthParkDE etc
@@ -46,16 +47,17 @@ You can also find lists of all [contributors of yt-dlp](CONTRIBUTORS) and [autho
 ## [bashonly](https://github.com/bashonly)
-* `--update-to`, automated release, nightly builds
-* `--cookies-from-browser` support for Firefox containers
-* Added support for new websites Genius, Kick, NBCStations, Triller, VideoKen etc
-* Improved/fixed support for Anvato, Brightcove, Instagram, ParamountPlus, Reddit, SlidesLive, TikTok, Twitter, Vimeo etc
+* `--update-to`, self-updater rewrite, automated/nightly/master releases
+* `--cookies-from-browser` support for Firefox containers, external downloader cookie handling overhaul
+* Added support for new websites like Dacast, Kick, NBCStations, Triller, VideoKen, Weverse, WrestleUniverse etc
+* Improved/fixed support for Anvato, Brightcove, Reddit, SlidesLive, TikTok, Twitter, Vimeo etc
 ## [Grub4K](https://github.com/Grub4K)
-[![ko-fi](https://img.shields.io/badge/_-Ko--fi-red.svg?logo=kofi&labelColor=555555&style=for-the-badge)](https://ko-fi.com/Grub4K) [![gh-sponsor](https://img.shields.io/badge/_-Github-white.svg?logo=github&labelColor=555555&style=for-the-badge)](https://github.com/sponsors/Grub4K)
+[![gh-sponsor](https://img.shields.io/badge/_-Github-white.svg?logo=github&labelColor=555555&style=for-the-badge)](https://github.com/sponsors/Grub4K) [![ko-fi](https://img.shields.io/badge/_-Ko--fi-red.svg?logo=kofi&labelColor=555555&style=for-the-badge)](https://ko-fi.com/Grub4K)
-* `--update-to`, automated release, nightly builds
-* Rework internals like `traverse_obj`, various core refactors and bugs fixes
-* Helped fix crunchyroll, Twitter, wrestleuniverse, wistia, slideslive etc
+* `--update-to`, self-updater rewrite, automated/nightly/master releases
+* Reworked internals like `traverse_obj`, various core refactors and bugs fixes
+* Implemented proper progress reporting for parallel downloads
+* Improved/fixed/added Bundestag, crunchyroll, pr0gramm, Twitter, WrestleUniverse etc

View File

@@ -76,7 +76,7 @@ yt-dlp is a [youtube-dl](https://github.com/ytdl-org/youtube-dl) fork based on t
 # NEW FEATURES
-* Forked from [**yt-dlc@f9401f2**](https://github.com/blackjack4494/yt-dlc/commit/f9401f2a91987068139c5f757b12fc711d4c0cee) and merged with [**youtube-dl@66ab08**](https://github.com/ytdl-org/youtube-dl/commit/66ab0814c4baa2dc79c2dd5287bc0ad61a37c5b9) ([exceptions](https://github.com/yt-dlp/yt-dlp/issues/21))
+* Forked from [**yt-dlc@f9401f2**](https://github.com/blackjack4494/yt-dlc/commit/f9401f2a91987068139c5f757b12fc711d4c0cee) and merged with [**youtube-dl@be008e6**](https://github.com/ytdl-org/youtube-dl/commit/be008e657d79832642e2158557c899249c9e31cd) ([exceptions](https://github.com/yt-dlp/yt-dlp/issues/21))
 * **[SponsorBlock Integration](#sponsorblock-options)**: You can mark/remove sponsor sections in YouTube videos by utilizing the [SponsorBlock](https://sponsor.ajay.app) API
@@ -131,7 +131,7 @@ Features marked with a **\*** have been back-ported to youtube-dl
 Some of yt-dlp's default options are different from that of youtube-dl and youtube-dlc:
-* yt-dlp supports only [Python 3.7+](## "Windows 7"), and *may* remove support for more versions as they [become EOL](https://devguide.python.org/versions/#python-release-cycle); while [youtube-dl still supports Python 2.6+ and 3.2+](https://github.com/ytdl-org/youtube-dl/issues/30568#issue-1118238743)
+* yt-dlp supports only [Python 3.8+](## "Windows 7"), and *may* remove support for more versions as they [become EOL](https://devguide.python.org/versions/#python-release-cycle); while [youtube-dl still supports Python 2.6+ and 3.2+](https://github.com/ytdl-org/youtube-dl/issues/30568#issue-1118238743)
 * The options `--auto-number` (`-A`), `--title` (`-t`) and `--literal` (`-l`), no longer work. See [removed options](#Removed) for details
 * `avconv` is not supported as an alternative to `ffmpeg`
 * yt-dlp stores config files in slightly different locations to youtube-dl. See [CONFIGURATION](#configuration) for a list of correct locations
@@ -159,6 +159,7 @@ Some of yt-dlp's default options are different from that of youtube-dl and youtu
 * yt-dlp versions between 2021.09.01 and 2023.01.02 applies `--match-filter` to nested playlists. This was an unintentional side-effect of [8f18ac](https://github.com/yt-dlp/yt-dlp/commit/8f18aca8717bb0dd49054555af8d386e5eda3a88) and is fixed in [d7b460](https://github.com/yt-dlp/yt-dlp/commit/d7b460d0e5fc710950582baed2e3fc616ed98a80). Use `--compat-options playlist-match-filter` to revert this
 * yt-dlp versions between 2021.11.10 and 2023.06.21 estimated `filesize_approx` values for fragmented/manifest formats. This was added for convenience in [f2fe69](https://github.com/yt-dlp/yt-dlp/commit/f2fe69c7b0d208bdb1f6292b4ae92bc1e1a7444a), but was reverted in [0dff8e](https://github.com/yt-dlp/yt-dlp/commit/0dff8e4d1e6e9fb938f4256ea9af7d81f42fd54f) due to the potentially extreme inaccuracy of the estimated values. Use `--compat-options manifest-filesize-approx` to keep extracting the estimated values
 * yt-dlp uses modern http client backends such as `requests`. Use `--compat-options prefer-legacy-http-handler` to prefer the legacy http handler (`urllib`) to be used for standard http requests.
+* The sub-module `swfinterp` is removed.

 For ease of use, a few more compat options are available:
@@ -266,7 +267,7 @@ gpg --verify SHA2-512SUMS.sig SHA2-512SUMS
 **Note**: The manpages, shell completion (autocomplete) files etc. are available inside the [source tarball](https://github.com/yt-dlp/yt-dlp/releases/latest/download/yt-dlp.tar.gz)
 ## DEPENDENCIES
-Python versions 3.7+ (CPython and PyPy) are supported. Other versions and implementations may or may not work correctly.
+Python versions 3.8+ (CPython and PyPy) are supported. Other versions and implementations may or may not work correctly.
 <!-- Python 3.5+ uses VC++14 and it is already embedded in the binary created
 <!x-- https://www.microsoft.com/en-us/download/details.aspx?id=26999 --x>
@@ -299,7 +300,7 @@ While all the other dependencies are optional, `ffmpeg` and `ffprobe` are highly
 * [**pycryptodomex**](https://github.com/Legrandin/pycryptodome)\* - For decrypting AES-128 HLS streams and various other data. Licensed under [BSD-2-Clause](https://github.com/Legrandin/pycryptodome/blob/master/LICENSE.rst)
 * [**phantomjs**](https://github.com/ariya/phantomjs) - Used in extractors where javascript needs to be run. Licensed under [BSD-3-Clause](https://github.com/ariya/phantomjs/blob/master/LICENSE.BSD)
-* [**secretstorage**](https://github.com/mitya57/secretstorage) - For `--cookies-from-browser` to access the **Gnome** keyring while decrypting cookies of **Chromium**-based browsers on **Linux**. Licensed under [BSD-3-Clause](https://github.com/mitya57/secretstorage/blob/master/LICENSE)
+* [**secretstorage**](https://github.com/mitya57/secretstorage)\* - For `--cookies-from-browser` to access the **Gnome** keyring while decrypting cookies of **Chromium**-based browsers on **Linux**. Licensed under [BSD-3-Clause](https://github.com/mitya57/secretstorage/blob/master/LICENSE)
 * Any external downloader that you want to use with `--downloader`
 ### Deprecated
@@ -334,7 +335,7 @@ On some systems, you may need to use `py` or `python` instead of `python3`.
 **Important**: Running `pyinstaller` directly **without** using `pyinst.py` is **not** officially supported. This may or may not work correctly.
 ### Platform-independent Binary (UNIX)
-You will need the build tools `python` (3.7+), `zip`, `make` (GNU), `pandoc`\* and `pytest`\*.
+You will need the build tools `python` (3.8+), `zip`, `make` (GNU), `pandoc`\* and `pytest`\*.
 After installing these, simply run `make`.
@@ -1268,7 +1269,7 @@ The field names themselves (the part inside the parenthesis) can also have some
 1. **Object traversal**: The dictionaries and lists available in metadata can be traversed by using a dot `.` separator; e.g. `%(tags.0)s`, `%(subtitles.en.-1.ext)s`. You can do Python slicing with colon `:`; E.g. `%(id.3:7:-1)s`, `%(formats.:.format_id)s`. Curly braces `{}` can be used to build dictionaries with only specific keys; e.g. `%(formats.:.{format_id,height})#j`. An empty field name `%()s` refers to the entire infodict; e.g. `%(.{id,title})s`. Note that all the fields that become available using this method are not listed below. Use `-j` to see such fields
-1. **Addition**: Addition and subtraction of numeric fields can be done using `+` and `-` respectively. E.g. `%(playlist_index+10)03d`, `%(n_entries+1-playlist_index)d`
+1. **Arithmetic**: Simple arithmetic can be done on numeric fields using `+`, `-` and `*`. E.g. `%(playlist_index+10)03d`, `%(n_entries+1-playlist_index)d`
 1. **Date/time Formatting**: Date/time fields can be formatted according to [strftime formatting](https://docs.python.org/3/library/datetime.html#strftime-and-strptime-format-codes) by specifying it separated from the field name using a `>`. E.g. `%(duration>%H-%M-%S)s`, `%(upload_date>%Y-%m-%d)s`, `%(epoch-3600>%H-%M-%S)s`
@@ -1309,6 +1310,7 @@ The available fields are:
 - `upload_date` (string): Video upload date in UTC (YYYYMMDD)
 - `release_timestamp` (numeric): UNIX timestamp of the moment the video was released
 - `release_date` (string): The date (YYYYMMDD) when the video was released in UTC
+- `release_year` (numeric): Year (YYYY) when the video or album was released
 - `modified_timestamp` (numeric): UNIX timestamp of the moment the video was last modified
 - `modified_date` (string): The date (YYYYMMDD) when the video was last modified in UTC
 - `uploader_id` (string): Nickname or id of the video uploader
@@ -1332,6 +1334,7 @@ The available fields are:
 - `was_live` (boolean): Whether this video was originally a live stream
 - `playable_in_embed` (string): Whether this video is allowed to play in embedded players on other sites
 - `availability` (string): Whether the video is "private", "premium_only", "subscriber_only", "needs_auth", "unlisted" or "public"
+- `media_type` (string): The type of media as classified by the site, e.g. "episode", "clip", "trailer"
 - `start_time` (numeric): Time in seconds where the reproduction should start, as specified in the URL
 - `end_time` (numeric): Time in seconds where the reproduction should end, as specified in the URL
 - `extractor` (string): Name of the extractor
@@ -1382,7 +1385,6 @@ Available for the media that is a track or a part of a music album:
 - `album_type` (string): Type of the album
 - `album_artist` (string): List of all artists appeared on the album
 - `disc_number` (numeric): Number of the disc or other physical medium the track belongs to
-- `release_year` (numeric): Year (YYYY) when the album was released

 Available only when using `--download-sections` and for `chapter:` prefix when using `--split-chapters` for videos with internal chapters:

View File

@@ -114,5 +114,11 @@
"action": "add", "action": "add",
"when": "f04b5bedad7b281bee9814686bba1762bae092eb", "when": "f04b5bedad7b281bee9814686bba1762bae092eb",
"short": "[priority] Security: [[CVE-2023-46121](https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2023-46121)] Patch [Generic Extractor MITM Vulnerability via Arbitrary Proxy Injection](https://github.com/yt-dlp/yt-dlp/security/advisories/GHSA-3ch3-jhc6-5r8x)\n\t- Disallow smuggling of arbitrary `http_headers`; extractors now only use specific headers" "short": "[priority] Security: [[CVE-2023-46121](https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2023-46121)] Patch [Generic Extractor MITM Vulnerability via Arbitrary Proxy Injection](https://github.com/yt-dlp/yt-dlp/security/advisories/GHSA-3ch3-jhc6-5r8x)\n\t- Disallow smuggling of arbitrary `http_headers`; extractors now only use specific headers"
},
{
"action": "change",
"when": "15f22b4880b6b3f71f350c64d70976ae65b9f1ca",
"short": "[webvtt] Allow spaces before newlines for CueBlock (#7681)",
"authors": ["TSRBerry"]
} }
] ]

View File

@@ -40,20 +40,6 @@ class CommitGroup(enum.Enum):
return { return {
name: group name: group
for group, names in { for group, names in {
cls.CORE: {
'aes',
'cache',
'compat_utils',
'compat',
'cookies',
'dependencies',
'formats',
'jsinterp',
'outtmpl',
'plugins',
'update',
'utils',
},
cls.MISC: { cls.MISC: {
'build', 'build',
'ci', 'ci',
@@ -404,9 +390,9 @@ class CommitRange:
if not group: if not group:
if self.EXTRACTOR_INDICATOR_RE.search(commit.short): if self.EXTRACTOR_INDICATOR_RE.search(commit.short):
group = CommitGroup.EXTRACTOR group = CommitGroup.EXTRACTOR
logger.error(f'Assuming [ie] group for {commit.short!r}')
else: else:
group = CommitGroup.POSTPROCESSOR group = CommitGroup.CORE
logger.warning(f'Failed to map {commit.short!r}, selected {group.name.lower()}')
commit_info = CommitInfo( commit_info = CommitInfo(
details, sub_details, message.strip(), details, sub_details, message.strip(),

View File

@@ -9,11 +9,7 @@ sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
import re import re
from devscripts.utils import ( from devscripts.utils import get_filename_args, read_file, write_file
get_filename_args,
read_file,
write_file,
)
VERBOSE_TMPL = ''' VERBOSE_TMPL = '''
- type: checkboxes - type: checkboxes

View File

@@ -1,17 +1,4 @@
@setlocal
@echo off @echo off
cd /d %~dp0..
if ["%~1"]==[""] ( >&2 echo run_tests.bat is deprecated. Please use `devscripts/run_tests.py` instead
set "test_set="test"" python %~dp0run_tests.py %~1
) else if ["%~1"]==["core"] (
set "test_set="-m not download""
) else if ["%~1"]==["download"] (
set "test_set="-m "download""
) else (
echo.Invalid test type "%~1". Use "core" ^| "download"
exit /b 1
)
set PYTHONWARNINGS=error
pytest %test_set%

71
devscripts/run_tests.py Executable file
View File

@@ -0,0 +1,71 @@
#!/usr/bin/env python3
import argparse
import functools
import os
import re
import subprocess
import sys
from pathlib import Path
fix_test_name = functools.partial(re.compile(r'IE(_all|_\d+)?$').sub, r'\1')
def parse_args():
parser = argparse.ArgumentParser(description='Run selected yt-dlp tests')
parser.add_argument(
'test', help='a extractor tests, or one of "core" or "download"', nargs='*')
parser.add_argument(
'-k', help='run a test matching EXPRESSION. Same as "pytest -k"', metavar='EXPRESSION')
return parser.parse_args()
def run_tests(*tests, pattern=None, ci=False):
run_core = 'core' in tests or (not pattern and not tests)
run_download = 'download' in tests
tests = list(map(fix_test_name, tests))
arguments = ['pytest', '-Werror', '--tb=short']
if ci:
arguments.append('--color=yes')
if run_core:
arguments.extend(['-m', 'not download'])
elif run_download:
arguments.extend(['-m', 'download'])
elif pattern:
arguments.extend(['-k', pattern])
else:
arguments.extend(
f'test/test_download.py::TestDownload::test_{test}' for test in tests)
print(f'Running {arguments}', flush=True)
try:
return subprocess.call(arguments)
except FileNotFoundError:
pass
arguments = [sys.executable, '-Werror', '-m', 'unittest']
if run_core:
print('"pytest" needs to be installed to run core tests', file=sys.stderr, flush=True)
return 1
elif run_download:
arguments.append('test.test_download')
elif pattern:
arguments.extend(['-k', pattern])
else:
arguments.extend(
f'test.test_download.TestDownload.test_{test}' for test in tests)
print(f'Running {arguments}', flush=True)
return subprocess.call(arguments)
if __name__ == '__main__':
try:
args = parse_args()
os.chdir(Path(__file__).parent.parent)
sys.exit(run_tests(*args.test, pattern=args.k, ci=bool(os.getenv('CI'))))
except KeyboardInterrupt:
pass

View File

@@ -1,14 +1,4 @@
#!/usr/bin/env sh #!/usr/bin/env sh
if [ -z "$1" ]; then >&2 echo 'run_tests.sh is deprecated. Please use `devscripts/run_tests.py` instead'
test_set='test' python3 devscripts/run_tests.py "$1"
elif [ "$1" = 'core' ]; then
test_set="-m not download"
elif [ "$1" = 'download' ]; then
test_set="-m download"
else
echo 'Invalid test type "'"$1"'". Use "core" | "download"'
exit 1
fi
python3 -bb -Werror -m pytest "$test_set"

View File

@@ -1,8 +1,8 @@
mutagen mutagen
pycryptodomex pycryptodomex
websockets
brotli; implementation_name=='cpython' brotli; implementation_name=='cpython'
brotlicffi; implementation_name!='cpython' brotlicffi; implementation_name!='cpython'
certifi certifi
requests>=2.31.0,<3 requests>=2.31.0,<3
urllib3>=1.26.17,<3 urllib3>=1.26.17,<3
websockets>=12.0

View File

@@ -26,7 +26,7 @@ markers =
[tox:tox] [tox:tox]
skipsdist = true skipsdist = true
envlist = py{36,37,38,39,310,311},pypy{36,37,38,39} envlist = py{38,39,310,311,312},pypy{38,39,310}
skip_missing_interpreters = true skip_missing_interpreters = true
[testenv] # tox [testenv] # tox
@@ -39,7 +39,7 @@ setenv =
[isort] [isort]
py_version = 37 py_version = 38
multi_line_output = VERTICAL_HANGING_INDENT multi_line_output = VERTICAL_HANGING_INDENT
line_length = 80 line_length = 80
reverse_relative = true reverse_relative = true

View File

@@ -152,7 +152,7 @@ def main():
url='https://github.com/yt-dlp/yt-dlp', url='https://github.com/yt-dlp/yt-dlp',
packages=packages(), packages=packages(),
install_requires=REQUIREMENTS, install_requires=REQUIREMENTS,
python_requires='>=3.7', python_requires='>=3.8',
project_urls={ project_urls={
'Documentation': 'https://github.com/yt-dlp/yt-dlp#readme', 'Documentation': 'https://github.com/yt-dlp/yt-dlp#readme',
'Source': 'https://github.com/yt-dlp/yt-dlp', 'Source': 'https://github.com/yt-dlp/yt-dlp',
@@ -164,11 +164,11 @@ def main():
'Development Status :: 5 - Production/Stable', 'Development Status :: 5 - Production/Stable',
'Environment :: Console', 'Environment :: Console',
'Programming Language :: Python', 'Programming Language :: Python',
'Programming Language :: Python :: 3.7',
'Programming Language :: Python :: 3.8', 'Programming Language :: Python :: 3.8',
'Programming Language :: Python :: 3.9', 'Programming Language :: Python :: 3.9',
'Programming Language :: Python :: 3.10', 'Programming Language :: Python :: 3.10',
'Programming Language :: Python :: 3.11', 'Programming Language :: Python :: 3.11',
'Programming Language :: Python :: 3.12',
'Programming Language :: Python :: Implementation', 'Programming Language :: Python :: Implementation',
'Programming Language :: Python :: Implementation :: CPython', 'Programming Language :: Python :: Implementation :: CPython',
'Programming Language :: Python :: Implementation :: PyPy', 'Programming Language :: Python :: Implementation :: PyPy',

View File

@@ -1,6 +1,4 @@
# Supported sites # Supported sites
- **0000studio:archive**
- **0000studio:clip**
- **17live** - **17live**
- **17live:clip** - **17live:clip**
- **1News**: 1news.co.nz article videos - **1News**: 1news.co.nz article videos
@@ -9,7 +7,6 @@
- **23video** - **23video**
- **247sports** - **247sports**
- **24tv.ua** - **24tv.ua**
- **24video**
- **3qsdn**: 3Q SDN - **3qsdn**: 3Q SDN
- **3sat** - **3sat**
- **4tube** - **4tube**
@@ -50,15 +47,18 @@
- **afreecatv**: [*afreecatv*](## "netrc machine") afreecatv.com - **afreecatv**: [*afreecatv*](## "netrc machine") afreecatv.com
- **afreecatv:live**: [*afreecatv*](## "netrc machine") afreecatv.com - **afreecatv:live**: [*afreecatv*](## "netrc machine") afreecatv.com
- **afreecatv:user** - **afreecatv:user**
- **AirMozilla**
- **AirTV** - **AirTV**
- **AitubeKZVideo** - **AitubeKZVideo**
- **AliExpressLive** - **AliExpressLive**
- **AlJazeera** - **AlJazeera**
- **Allocine** - **Allocine**
- **Allstar**
- **AllstarProfile**
- **AlphaPorno** - **AlphaPorno**
- **Alsace20TV** - **Alsace20TV**
- **Alsace20TVEmbed** - **Alsace20TVEmbed**
- **altcensored**
- **altcensored:channel**
- **Alura**: [*alura*](## "netrc machine") - **Alura**: [*alura*](## "netrc machine")
- **AluraCourse**: [*aluracourse*](## "netrc machine") - **AluraCourse**: [*aluracourse*](## "netrc machine")
- **Amara** - **Amara**
@@ -79,7 +79,7 @@
- **ant1newsgr:embed**: ant1news.gr embedded videos - **ant1newsgr:embed**: ant1news.gr embedded videos
- **antenna:watch**: antenna.gr and ant1news.gr videos - **antenna:watch**: antenna.gr and ant1news.gr videos
- **Anvato** - **Anvato**
- **aol.com**: Yahoo screen and movies - **aol.com**: Yahoo screen and movies (**Currently broken**)
- **APA** - **APA**
- **Aparat** - **Aparat**
- **AppleConnect** - **AppleConnect**
@@ -90,8 +90,8 @@
- **archive.org**: archive.org video and audio - **archive.org**: archive.org video and audio
- **ArcPublishing** - **ArcPublishing**
- **ARD** - **ARD**
- **ARD:mediathek** - **ARDMediathek**
- **ARDBetaMediathek** - **ARDMediathekCollection**
- **Arkena** - **Arkena**
- **arte.sky.it** - **arte.sky.it**
- **ArteTV** - **ArteTV**
@@ -100,7 +100,6 @@
- **ArteTVPlaylist** - **ArteTVPlaylist**
- **AtresPlayer**: [*atresplayer*](## "netrc machine") - **AtresPlayer**: [*atresplayer*](## "netrc machine")
- **AtScaleConfEvent** - **AtScaleConfEvent**
- **ATTTechChannel**
- **ATVAt** - **ATVAt**
- **AudiMedia** - **AudiMedia**
- **AudioBoom** - **AudioBoom**
@@ -140,12 +139,12 @@
- **BeatBumpVideo** - **BeatBumpVideo**
- **Beatport** - **Beatport**
- **Beeg** - **Beeg**
- **BehindKink** - **BehindKink**: (**Currently broken**)
- **Bellator** - **Bellator**
- **BellMedia** - **BellMedia**
- **BerufeTV** - **BerufeTV**
- **Bet** - **Bet**: (**Currently broken**)
- **bfi:player** - **bfi:player**: (**Currently broken**)
- **bfmtv** - **bfmtv**
- **bfmtv:article** - **bfmtv:article**
- **bfmtv:live** - **bfmtv:live**
@@ -162,6 +161,8 @@
- **BiliBiliBangumi** - **BiliBiliBangumi**
- **BiliBiliBangumiMedia** - **BiliBiliBangumiMedia**
- **BiliBiliBangumiSeason** - **BiliBiliBangumiSeason**
- **BilibiliCheese**
- **BilibiliCheeseSeason**
- **BilibiliCollectionList** - **BilibiliCollectionList**
- **BilibiliFavoritesList** - **BilibiliFavoritesList**
- **BiliBiliPlayer** - **BiliBiliPlayer**
@@ -176,11 +177,8 @@
- **BiliLive** - **BiliLive**
- **BioBioChileTV** - **BioBioChileTV**
- **Biography** - **Biography**
- **BIQLE**
- **BitChute** - **BitChute**
- **BitChuteChannel** - **BitChuteChannel**
- **bitwave:replay**
- **bitwave:stream**
- **BlackboardCollaborate** - **BlackboardCollaborate**
- **BleacherReport** - **BleacherReport**
- **BleacherReportCMS** - **BleacherReportCMS**
@@ -193,7 +191,7 @@
- **Box** - **Box**
- **BoxCastVideo** - **BoxCastVideo**
- **Bpb**: Bundeszentrale für politische Bildung - **Bpb**: Bundeszentrale für politische Bildung
- **BR**: Bayerischer Rundfunk - **BR**: Bayerischer Rundfunk (**Currently broken**)
- **BrainPOP**: [*brainpop*](## "netrc machine") - **BrainPOP**: [*brainpop*](## "netrc machine")
- **BrainPOPELL**: [*brainpop*](## "netrc machine") - **BrainPOPELL**: [*brainpop*](## "netrc machine")
- **BrainPOPEsp**: [*brainpop*](## "netrc machine") BrainPOP Español - **BrainPOPEsp**: [*brainpop*](## "netrc machine") BrainPOP Español
@@ -201,19 +199,18 @@
- **BrainPOPIl**: [*brainpop*](## "netrc machine") BrainPOP Hebrew - **BrainPOPIl**: [*brainpop*](## "netrc machine") BrainPOP Hebrew
- **BrainPOPJr**: [*brainpop*](## "netrc machine") - **BrainPOPJr**: [*brainpop*](## "netrc machine")
- **BravoTV** - **BravoTV**
- **Break**
- **BreitBart** - **BreitBart**
- **brightcove:legacy** - **brightcove:legacy**
- **brightcove:new** - **brightcove:new**
- **Brilliantpala:Classes**: [*brilliantpala*](## "netrc machine") VoD on classes.brilliantpala.org - **Brilliantpala:Classes**: [*brilliantpala*](## "netrc machine") VoD on classes.brilliantpala.org
- **Brilliantpala:Elearn**: [*brilliantpala*](## "netrc machine") VoD on elearn.brilliantpala.org - **Brilliantpala:Elearn**: [*brilliantpala*](## "netrc machine") VoD on elearn.brilliantpala.org
- **BRMediathek**: Bayerischer Rundfunk Mediathek
- **bt:article**: Bergens Tidende Articles - **bt:article**: Bergens Tidende Articles
- **bt:vestlendingen**: Bergens Tidende - Vestlendingen - **bt:vestlendingen**: Bergens Tidende - Vestlendingen
- **Bundesliga** - **Bundesliga**
- **Bundestag**
- **BusinessInsider** - **BusinessInsider**
- **BuzzFeed** - **BuzzFeed**
- **BYUtv** - **BYUtv**: (**Currently broken**)
- **CableAV** - **CableAV**
- **Callin** - **Callin**
- **Caltrans** - **Caltrans**
@@ -225,14 +222,11 @@
- **CamModels** - **CamModels**
- **Camsoda** - **Camsoda**
- **CamtasiaEmbed** - **CamtasiaEmbed**
- **CamWithHer**
- **Canal1** - **Canal1**
- **CanalAlpha** - **CanalAlpha**
- **canalc2.tv** - **canalc2.tv**
- **Canalplus**: mycanal.fr and piwiplus.fr - **Canalplus**: mycanal.fr and piwiplus.fr
- **CaracolTvPlay**: [*caracoltv-play*](## "netrc machine") - **CaracolTvPlay**: [*caracoltv-play*](## "netrc machine")
- **CarambaTV**
- **CarambaTVPage**
- **CartoonNetwork** - **CartoonNetwork**
- **cbc.ca** - **cbc.ca**
- **cbc.ca:player** - **cbc.ca:player**
@@ -254,16 +248,12 @@
- **Cellebrite** - **Cellebrite**
- **CeskaTelevize** - **CeskaTelevize**
- **CGTN** - **CGTN**
- **channel9**: Channel 9
- **CharlieRose** - **CharlieRose**
- **Chaturbate** - **Chaturbate**
- **Chilloutzone** - **Chilloutzone**
- **Chingari** - **Chingari**
- **ChingariUser** - **ChingariUser**
- **chirbit**
- **chirbit:profile**
- **cielotv.it** - **cielotv.it**
- **Cinchcast**
- **Cinemax** - **Cinemax**
- **CinetecaMilano** - **CinetecaMilano**
- **Cineverse** - **Cineverse**
@@ -276,14 +266,12 @@
- **cliphunter** - **cliphunter**
- **Clippit** - **Clippit**
- **ClipRs** - **ClipRs**
- **Clipsyndicate**
- **ClipYouEmbed** - **ClipYouEmbed**
- **CloserToTruth** - **CloserToTruth**
- **CloudflareStream** - **CloudflareStream**
- **Cloudy** - **Clubic**: (**Currently broken**)
- **Clubic**
- **Clyp** - **Clyp**
- **cmt.com** - **cmt.com**: (**Currently broken**)
- **CNBC** - **CNBC**
- **CNBCVideo** - **CNBCVideo**
- **CNN** - **CNN**
@@ -328,7 +316,6 @@
- **CybraryCourse**: [*cybrary*](## "netrc machine") - **CybraryCourse**: [*cybrary*](## "netrc machine")
- **DacastPlaylist** - **DacastPlaylist**
- **DacastVOD** - **DacastVOD**
- **Daftsex**
- **DagelijkseKost**: dagelijksekost.een.be - **DagelijkseKost**: dagelijksekost.een.be
- **DailyMail** - **DailyMail**
- **dailymotion**: [*dailymotion*](## "netrc machine") - **dailymotion**: [*dailymotion*](## "netrc machine")
@@ -347,13 +334,12 @@
- **DctpTv** - **DctpTv**
- **DeezerAlbum** - **DeezerAlbum**
- **DeezerPlaylist** - **DeezerPlaylist**
- **defense.gouv.fr**
- **democracynow** - **democracynow**
- **DestinationAmerica** - **DestinationAmerica**
- **DetikEmbed** - **DetikEmbed**
- **DeuxM** - **DeuxM**
- **DeuxMNews** - **DeuxMNews**
- **DHM**: Filmarchiv - Deutsches Historisches Museum - **DHM**: Filmarchiv - Deutsches Historisches Museum (**Currently broken**)
- **Digg** - **Digg**
- **DigitalConcertHall**: [*digitalconcerthall*](## "netrc machine") DigitalConcertHall extractor - **DigitalConcertHall**: [*digitalconcerthall*](## "netrc machine") DigitalConcertHall extractor
- **DigitallySpeaking** - **DigitallySpeaking**
@@ -373,7 +359,6 @@
- **dlf:corpus**: DLF Multi-feed Archives - **dlf:corpus**: DLF Multi-feed Archives
- **dlive:stream** - **dlive:stream**
- **dlive:vod** - **dlive:vod**
- **Dotsub**
- **Douyin** - **Douyin**
- **DouyuShow** - **DouyuShow**
- **DouyuTV**: 斗鱼直播 - **DouyuTV**: 斗鱼直播
@@ -392,35 +377,29 @@
- **duboku**: www.duboku.io - **duboku**: www.duboku.io
- **duboku:list**: www.duboku.io entire series - **duboku:list**: www.duboku.io entire series
- **Dumpert** - **Dumpert**
- **Duoplay**
- **dvtv**: http://video.aktualne.cz/ - **dvtv**: http://video.aktualne.cz/
- **dw** - **dw**
- **dw:article** - **dw:article**
- **EaglePlatform** - **EaglePlatform**
- **EbaumsWorld** - **EbaumsWorld**
- **Ebay** - **Ebay**
- **EchoMsk**
- **egghead:course**: egghead.io course - **egghead:course**: egghead.io course
- **egghead:lesson**: egghead.io lesson - **egghead:lesson**: egghead.io lesson
- **ehftv**
- **eHow**
- **EinsUndEinsTV**: [*1und1tv*](## "netrc machine") - **EinsUndEinsTV**: [*1und1tv*](## "netrc machine")
- **EinsUndEinsTVLive**: [*1und1tv*](## "netrc machine") - **EinsUndEinsTVLive**: [*1und1tv*](## "netrc machine")
- **EinsUndEinsTVRecordings**: [*1und1tv*](## "netrc machine") - **EinsUndEinsTVRecordings**: [*1und1tv*](## "netrc machine")
- **Einthusan** - **Einthusan**
- **eitb.tv** - **eitb.tv**
- **ElevenSports**
- **EllenTube**
- **EllenTubePlaylist**
- **EllenTubeVideo**
- **Elonet** - **Elonet**
- **ElPais**: El País - **ElPais**: El País
- **ElTreceTV**: El Trece TV (Argentina) - **ElTreceTV**: El Trece TV (Argentina)
- **Embedly** - **Embedly**
- **EMPFlix** - **EMPFlix**
- **Engadget**
- **Epicon** - **Epicon**
- **EpiconSeries** - **EpiconSeries**
- **eplus:inbound**: e+ (イープラス) overseas - **EpidemicSound**
- **eplus**: [*eplus*](## "netrc machine") e+ (イープラス)
- **Epoch** - **Epoch**
- **Eporner** - **Eporner**
- **Erocast** - **Erocast**
@@ -429,11 +408,9 @@
- **ertflix**: ERTFLIX videos - **ertflix**: ERTFLIX videos
- **ertflix:codename**: ERTFLIX videos by codename - **ertflix:codename**: ERTFLIX videos by codename
- **ertwebtv:embed**: ert.gr webtv embedded videos - **ertwebtv:embed**: ert.gr webtv embedded videos
- **Escapist**
- **ESPN** - **ESPN**
- **ESPNArticle** - **ESPNArticle**
- **ESPNCricInfo** - **ESPNCricInfo**
- **EsriVideo**
- **EttuTv** - **EttuTv**
- **Europa** - **Europa**
- **EuroParlWebstream** - **EuroParlWebstream**
@@ -443,9 +420,7 @@
- **EWETV**: [*ewetv*](## "netrc machine") - **EWETV**: [*ewetv*](## "netrc machine")
- **EWETVLive**: [*ewetv*](## "netrc machine") - **EWETVLive**: [*ewetv*](## "netrc machine")
- **EWETVRecordings**: [*ewetv*](## "netrc machine") - **EWETVRecordings**: [*ewetv*](## "netrc machine")
- **ExpoTV**
- **Expressen** - **Expressen**
- **ExtremeTube**
- **EyedoTV** - **EyedoTV**
- **facebook**: [*facebook*](## "netrc machine") - **facebook**: [*facebook*](## "netrc machine")
- **facebook:reel** - **facebook:reel**
@@ -465,6 +440,8 @@
- **FiveThirtyEight** - **FiveThirtyEight**
- **FiveTV** - **FiveTV**
- **Flickr** - **Flickr**
- **Floatplane**
- **FloatplaneChannel**
- **Folketinget**: Folketinget (ft.dk; Danish parliament) - **Folketinget**: Folketinget (ft.dk; Danish parliament)
- **FoodNetwork** - **FoodNetwork**
- **FootyRoom** - **FootyRoom**
@@ -472,7 +449,6 @@
- **FOX** - **FOX**
- **FOX9** - **FOX9**
- **FOX9News** - **FOX9News**
- **Foxgay**
- **foxnews**: Fox News and Fox Business Video - **foxnews**: Fox News and Fox Business Video
- **foxnews:article** - **foxnews:article**
- **FoxNewsVideo** - **FoxNewsVideo**
@@ -496,7 +472,6 @@
- **funimation:show**: [*funimation*](## "netrc machine") - **funimation:show**: [*funimation*](## "netrc machine")
- **Funk** - **Funk**
- **Funker530** - **Funker530**
- **Fusion**
- **Fux** - **Fux**
- **FuyinTV** - **FuyinTV**
- **Gab** - **Gab**
@@ -522,7 +497,6 @@
- **GeniusLyrics** - **GeniusLyrics**
- **Gettr** - **Gettr**
- **GettrStreaming** - **GettrStreaming**
- **Gfycat**
- **GiantBomb** - **GiantBomb**
- **Giga** - **Giga**
- **GlattvisionTV**: [*glattvisiontv*](## "netrc machine") - **GlattvisionTV**: [*glattvisiontv*](## "netrc machine")
@@ -564,7 +538,6 @@
- **HearThisAt** - **HearThisAt**
- **Heise** - **Heise**
- **HellPorno** - **HellPorno**
- **Helsinki**: helsinki.fi
- **hetklokhuis** - **hetklokhuis**
- **hgtv.com:show** - **hgtv.com:show**
- **HGTVDe** - **HGTVDe**
@@ -573,8 +546,6 @@
- **HistoricFilms** - **HistoricFilms**
- **history:player** - **history:player**
- **history:topic**: History.com Topic - **history:topic**: History.com Topic
- **hitbox**
- **hitbox:live**
- **HitRecord** - **HitRecord**
- **hketv**: 香港教育局教育電視 (HKETV) Educational Television, Hong Kong Educational Bureau - **hketv**: 香港教育局教育電視 (HKETV) Educational Television, Hong Kong Educational Bureau
- **HollywoodReporter** - **HollywoodReporter**
@@ -585,8 +556,6 @@
- **hotstar:playlist** - **hotstar:playlist**
- **hotstar:season** - **hotstar:season**
- **hotstar:series** - **hotstar:series**
- **Howcast**
- **HowStuffWorks**
- **hrfernsehen** - **hrfernsehen**
- **HRTi**: [*hrti*](## "netrc machine") - **HRTi**: [*hrti*](## "netrc machine")
- **HRTiPlaylist**: [*hrti*](## "netrc machine") - **HRTiPlaylist**: [*hrti*](## "netrc machine")
@@ -608,7 +577,7 @@
- **ign.com** - **ign.com**
- **IGNArticle** - **IGNArticle**
- **IGNVideo** - **IGNVideo**
- **IHeartRadio** - **iheartradio**
- **iheartradio:podcast** - **iheartradio:podcast**
- **Iltalehti** - **Iltalehti**
- **imdb**: Internet Movie Database trailers - **imdb**: Internet Movie Database trailers
@@ -638,7 +607,6 @@
- **IsraelNationalNews** - **IsraelNationalNews**
- **ITProTV** - **ITProTV**
- **ITProTVCourse** - **ITProTVCourse**
- **ITTF**
- **ITV** - **ITV**
- **ITVBTCC** - **ITVBTCC**
- **ivi**: ivi.ru - **ivi**: ivi.ru
@@ -658,6 +626,7 @@
- **JioSaavnAlbum** - **JioSaavnAlbum**
- **JioSaavnSong** - **JioSaavnSong**
- **Joj** - **Joj**
- **JoqrAg**: 超!A&G+ 文化放送 (f.k.a. AGQR) Nippon Cultural Broadcasting, Inc. (JOQR)
- **Jove** - **Jove**
- **JStream** - **JStream**
- **JTBC**: jtbc.co.kr - **JTBC**: jtbc.co.kr
@@ -670,7 +639,6 @@
- **Karaoketv** - **Karaoketv**
- **KarriereVideos** - **KarriereVideos**
- **Katsomo** - **Katsomo**
- **KeezMovies**
- **KelbyOne** - **KelbyOne**
- **Ketnet** - **Ketnet**
- **khanacademy** - **khanacademy**
@@ -679,7 +647,7 @@
- **Kicker** - **Kicker**
- **KickStarter** - **KickStarter**
- **KickVOD** - **KickVOD**
- **KinjaEmbed** - **kinja:embed**
- **KinoPoisk** - **KinoPoisk**
- **Kommunetv** - **Kommunetv**
- **KompasVideo** - **KompasVideo**
@@ -698,8 +666,6 @@
- **la7.it** - **la7.it**
- **la7.it:pod:episode** - **la7.it:pod:episode**
- **la7.it:podcast** - **la7.it:podcast**
- **laola1tv**
- **laola1tv:embed**
- **LastFM** - **LastFM**
- **LastFMPlaylist** - **LastFMPlaylist**
- **LastFMUser** - **LastFMUser**
@@ -733,7 +699,6 @@
- **LinkedIn**: [*linkedin*](## "netrc machine") - **LinkedIn**: [*linkedin*](## "netrc machine")
- **linkedin:learning**: [*linkedin*](## "netrc machine") - **linkedin:learning**: [*linkedin*](## "netrc machine")
- **linkedin:learning:course**: [*linkedin*](## "netrc machine") - **linkedin:learning:course**: [*linkedin*](## "netrc machine")
- **LinuxAcademy**: [*linuxacademy*](## "netrc machine")
- **Liputan6** - **Liputan6**
- **ListenNotes** - **ListenNotes**
- **LiTV** - **LiTV**
@@ -751,7 +716,7 @@
- **Lumni** - **Lumni**
- **lynda**: [*lynda*](## "netrc machine") lynda.com videos - **lynda**: [*lynda*](## "netrc machine") lynda.com videos
- **lynda:course**: [*lynda*](## "netrc machine") lynda.com online courses - **lynda:course**: [*lynda*](## "netrc machine") lynda.com online courses
- **m6** - **maariv.co.il**
- **MagellanTV** - **MagellanTV**
- **MagentaMusik360** - **MagentaMusik360**
- **mailru**: Видео@Mail.Ru - **mailru**: Видео@Mail.Ru
@@ -793,11 +758,8 @@
- **megatvcom:embed**: megatv.com embedded videos - **megatvcom:embed**: megatv.com embedded videos
- **Meipai**: 美拍 - **Meipai**: 美拍
- **MelonVOD** - **MelonVOD**
- **META**
- **metacafe**
- **Metacritic** - **Metacritic**
- **mewatch** - **mewatch**
- **Mgoon**
- **MiaoPai** - **MiaoPai**
- **MicrosoftEmbed** - **MicrosoftEmbed**
- **microsoftstream**: Microsoft Stream - **microsoftstream**: Microsoft Stream
@@ -810,7 +772,6 @@
- **minds:group** - **minds:group**
- **MinistryGrid** - **MinistryGrid**
- **Minoto** - **Minoto**
- **miomio.tv**
- **mirrativ** - **mirrativ**
- **mirrativ:user** - **mirrativ:user**
- **MirrorCoUK** - **MirrorCoUK**
@@ -825,14 +786,10 @@
- **MLBTV**: [*mlb*](## "netrc machine") - **MLBTV**: [*mlb*](## "netrc machine")
- **MLBVideo** - **MLBVideo**
- **MLSSoccer** - **MLSSoccer**
- **Mnet**
- **MNetTV**: [*mnettv*](## "netrc machine") - **MNetTV**: [*mnettv*](## "netrc machine")
- **MNetTVLive**: [*mnettv*](## "netrc machine") - **MNetTVLive**: [*mnettv*](## "netrc machine")
- **MNetTVRecordings**: [*mnettv*](## "netrc machine") - **MNetTVRecordings**: [*mnettv*](## "netrc machine")
- **MochaVideo** - **MochaVideo**
- **MoeVideo**: LetitBit video services: moevideo.net, playreplay.net and videochart.net
- **Mofosex**
- **MofosexEmbed**
- **Mojvideo** - **Mojvideo**
- **Monstercat** - **Monstercat**
- **MonsterSirenHypergryphMusic** - **MonsterSirenHypergryphMusic**
@@ -843,13 +800,12 @@
- **Motorsport**: motorsport.com - **Motorsport**: motorsport.com
- **MotorTrend** - **MotorTrend**
- **MotorTrendOnDemand** - **MotorTrendOnDemand**
- **MovieClips**
- **MovieFap** - **MovieFap**
- **Moviepilot** - **Moviepilot**
- **MoviewPlay** - **MoviewPlay**
- **Moviezine** - **Moviezine**
- **MovingImage** - **MovingImage**
- **MSN** - **MSN**: (**Currently broken**)
- **mtg**: MTG services - **mtg**: MTG services
- **mtv** - **mtv**
- **mtv.de** - **mtv.de**
@@ -871,18 +827,13 @@
- **MusicdexSong** - **MusicdexSong**
- **mva**: Microsoft Virtual Academy videos - **mva**: Microsoft Virtual Academy videos
- **mva:course**: Microsoft Virtual Academy courses - **mva:course**: Microsoft Virtual Academy courses
- **Mwave**
- **MwaveMeetGreet**
- **Mxplayer** - **Mxplayer**
- **MxplayerShow** - **MxplayerShow**
- **MyChannels**
- **MySpace** - **MySpace**
- **MySpace:album** - **MySpace:album**
- **MySpass** - **MySpass**
- **Myvi**
- **MyVideoGe** - **MyVideoGe**
- **MyVidster** - **MyVidster**
- **MyviEmbed**
- **Mzaalo** - **Mzaalo**
- **n-tv.de** - **n-tv.de**
- **N1Info:article** - **N1Info:article**
@@ -894,12 +845,12 @@
- **Naver** - **Naver**
- **Naver:live** - **Naver:live**
- **navernow** - **navernow**
- **NBA** - **nba**
- **nba:channel**
- **nba:embed**
- **nba:watch** - **nba:watch**
- **nba:watch:collection** - **nba:watch:collection**
- **NBAChannel** - **nba:watch:embed**
- **NBAEmbed**
- **NBAWatchEmbed**
- **NBC** - **NBC**
- **NBCNews** - **NBCNews**
- **nbcolympics** - **nbcolympics**
@@ -914,6 +865,7 @@
- **NDTV** - **NDTV**
- **Nebula**: [*watchnebula*](## "netrc machine") - **Nebula**: [*watchnebula*](## "netrc machine")
- **nebula:channel**: [*watchnebula*](## "netrc machine") - **nebula:channel**: [*watchnebula*](## "netrc machine")
- **nebula:class**: [*watchnebula*](## "netrc machine")
- **nebula:subscriptions**: [*watchnebula*](## "netrc machine") - **nebula:subscriptions**: [*watchnebula*](## "netrc machine")
- **NekoHacker** - **NekoHacker**
- **NerdCubedFeed** - **NerdCubedFeed**
@@ -935,7 +887,6 @@
- **Newgrounds:playlist** - **Newgrounds:playlist**
- **Newgrounds:user** - **Newgrounds:user**
- **NewsPicks** - **NewsPicks**
- **Newstube**
- **Newsy** - **Newsy**
- **NextMedia**: 蘋果日報 - **NextMedia**: 蘋果日報
- **NextMediaActionNews**: 蘋果日報 - 動新聞 - **NextMediaActionNews**: 蘋果日報 - 動新聞
@@ -961,7 +912,6 @@
- **nick.de** - **nick.de**
- **nickelodeon:br** - **nickelodeon:br**
- **nickelodeonru** - **nickelodeonru**
- **nicknight**
- **niconico**: [*niconico*](## "netrc machine") ニコニコ動画 - **niconico**: [*niconico*](## "netrc machine") ニコニコ動画
- **niconico:history**: NicoNico user history or likes. Requires cookies. - **niconico:history**: NicoNico user history or likes. Requires cookies.
- **niconico:live**: ニコニコ生放送 - **niconico:live**: ニコニコ生放送
@@ -984,9 +934,7 @@
- **NonkTube** - **NonkTube**
- **NoodleMagazine** - **NoodleMagazine**
- **Noovo** - **Noovo**
- **Normalboots**
- **NOSNLArticle** - **NOSNLArticle**
- **NosVideo**
- **Nova**: TN.cz, Prásk.tv, Nova.cz, Novaplus.cz, FANDA.tv, Krásná.cz and Doma.cz - **Nova**: TN.cz, Prásk.tv, Nova.cz, Novaplus.cz, FANDA.tv, Krásná.cz and Doma.cz
- **NovaEmbed** - **NovaEmbed**
- **NovaPlay** - **NovaPlay**
@@ -1009,7 +957,7 @@
- **NRKTVEpisodes** - **NRKTVEpisodes**
- **NRKTVSeason** - **NRKTVSeason**
- **NRKTVSeries** - **NRKTVSeries**
- **NRLTV** - **NRLTV**: (**Currently broken**)
- **ntv.ru** - **ntv.ru**
- **NubilesPorn**: [*nubiles-porn*](## "netrc machine") - **NubilesPorn**: [*nubiles-porn*](## "netrc machine")
- **Nuvid** - **Nuvid**
@@ -1037,8 +985,6 @@
- **onet.tv:channel** - **onet.tv:channel**
- **OnetMVP** - **OnetMVP**
- **OnionStudios** - **OnionStudios**
- **Ooyala**
- **OoyalaExternal**
- **Opencast** - **Opencast**
- **OpencastPlaylist** - **OpencastPlaylist**
- **openrec** - **openrec**
@@ -1060,7 +1006,6 @@
- **PalcoMP3:artist** - **PalcoMP3:artist**
- **PalcoMP3:song** - **PalcoMP3:song**
- **PalcoMP3:video** - **PalcoMP3:video**
- **pandora.tv**: 판도라TV
- **Panopto** - **Panopto**
- **PanoptoList** - **PanoptoList**
- **PanoptoPlaylist** - **PanoptoPlaylist**
@@ -1082,7 +1027,6 @@
- **PeerTube:Playlist** - **PeerTube:Playlist**
- **peloton**: [*peloton*](## "netrc machine") - **peloton**: [*peloton*](## "netrc machine")
- **peloton:live**: Peloton Live - **peloton:live**: Peloton Live
- **People**
- **PerformGroup** - **PerformGroup**
- **periscope**: Periscope - **periscope**: Periscope
- **periscope:user**: Periscope user videos - **periscope:user**: Periscope user videos
@@ -1104,14 +1048,11 @@
- **PlanetMarathi** - **PlanetMarathi**
- **Platzi**: [*platzi*](## "netrc machine") - **Platzi**: [*platzi*](## "netrc machine")
- **PlatziCourse**: [*platzi*](## "netrc machine") - **PlatziCourse**: [*platzi*](## "netrc machine")
- **play.fm**
- **player.sky.it** - **player.sky.it**
- **PlayPlusTV**: [*playplustv*](## "netrc machine") - **PlayPlusTV**: [*playplustv*](## "netrc machine")
- **PlayStuff** - **PlayStuff**
- **PlaysTV**
- **PlaySuisse** - **PlaySuisse**
- **Playtvak**: Playtvak.cz, iDNES.cz and Lidovky.cz - **Playtvak**: Playtvak.cz, iDNES.cz and Lidovky.cz
- **Playvid**
- **PlayVids** - **PlayVids**
- **Playwire** - **Playwire**
- **pluralsight**: [*pluralsight*](## "netrc machine") - **pluralsight**: [*pluralsight*](## "netrc machine")
@@ -1136,11 +1077,8 @@
- **Popcorntimes** - **Popcorntimes**
- **PopcornTV** - **PopcornTV**
- **Pornbox** - **Pornbox**
- **PornCom**
- **PornerBros** - **PornerBros**
- **Pornez**
- **PornFlip** - **PornFlip**
- **PornHd**
- **PornHub**: [*pornhub*](## "netrc machine") PornHub and Thumbzilla - **PornHub**: [*pornhub*](## "netrc machine") PornHub and Thumbzilla
- **PornHubPagedVideoList**: [*pornhub*](## "netrc machine") - **PornHubPagedVideoList**: [*pornhub*](## "netrc machine")
- **PornHubPlaylist**: [*pornhub*](## "netrc machine") - **PornHubPlaylist**: [*pornhub*](## "netrc machine")
@@ -1182,7 +1120,6 @@
- **Radiko** - **Radiko**
- **RadikoRadio** - **RadikoRadio**
- **radio.de** - **radio.de**
- **radiobremen**
- **radiocanada** - **radiocanada**
- **radiocanada:audiovideo** - **radiocanada:audiovideo**
- **RadioComercial** - **RadioComercial**
@@ -1222,7 +1159,6 @@
- **RCTIPlusSeries** - **RCTIPlusSeries**
- **RCTIPlusTV** - **RCTIPlusTV**
- **RDS**: RDS.ca - **RDS**: RDS.ca
- **Recurbate**
- **RedBull** - **RedBull**
- **RedBullEmbed** - **RedBullEmbed**
- **RedBullTV** - **RedBullTV**
@@ -1239,7 +1175,7 @@
- **Reuters** - **Reuters**
- **ReverbNation** - **ReverbNation**
- **RheinMainTV** - **RheinMainTV**
- **RICE** - **RinseFM**
- **RMCDecouverte** - **RMCDecouverte**
- **RockstarGames** - **RockstarGames**
- **Rokfin**: [*rokfin*](## "netrc machine") - **Rokfin**: [*rokfin*](## "netrc machine")
@@ -1260,8 +1196,6 @@
- **rtl.lu:tele-vod** - **rtl.lu:tele-vod**
- **rtl.nl**: rtl.nl and rtlxl.nl - **rtl.nl**: rtl.nl and rtlxl.nl
- **rtl2** - **rtl2**
- **rtl2:you**
- **rtl2:you:series**
- **RTLLuLive** - **RTLLuLive**
- **RTLLuRadio** - **RTLLuRadio**
- **RTNews** - **RTNews**
@@ -1276,10 +1210,9 @@
- **rtve.es:infantil**: RTVE infantil - **rtve.es:infantil**: RTVE infantil
- **rtve.es:live**: RTVE.es live streams - **rtve.es:live**: RTVE.es live streams
- **rtve.es:television** - **rtve.es:television**
- **RTVNH**
- **RTVS** - **RTVS**
- **rtvslo.si** - **rtvslo.si**
- **RUHD** - **RudoVideo**
- **Rule34Video** - **Rule34Video**
- **Rumble** - **Rumble**
- **RumbleChannel** - **RumbleChannel**
@@ -1326,8 +1259,8 @@
- **ScrippsNetworks** - **ScrippsNetworks**
- **scrippsnetworks:watch** - **scrippsnetworks:watch**
- **Scrolller** - **Scrolller**
- **SCTE**: [*scte*](## "netrc machine") - **SCTE**: [*scte*](## "netrc machine") (**Currently broken**)
- **SCTECourse**: [*scte*](## "netrc machine") - **SCTECourse**: [*scte*](## "netrc machine") (**Currently broken**)
- **Seeker** - **Seeker**
- **SenalColombiaLive** - **SenalColombiaLive**
- **SenateGov** - **SenateGov**
@@ -1339,7 +1272,6 @@
- **SeznamZpravyArticle** - **SeznamZpravyArticle**
- **Shahid**: [*shahid*](## "netrc machine") - **Shahid**: [*shahid*](## "netrc machine")
- **ShahidShow** - **ShahidShow**
- **Shared**: shared.sx
- **ShareVideosEmbed** - **ShareVideosEmbed**
- **ShemarooMe** - **ShemarooMe**
- **ShowRoomLive** - **ShowRoomLive**
@@ -1391,7 +1323,6 @@
- **SovietsClosetPlaylist** - **SovietsClosetPlaylist**
- **SpankBang** - **SpankBang**
- **SpankBangPlaylist** - **SpankBangPlaylist**
- **Spankwire**
- **Spiegel** - **Spiegel**
- **Sport5** - **Sport5**
- **SportBox** - **SportBox**
@@ -1404,7 +1335,7 @@
- **SpreakerShowPage** - **SpreakerShowPage**
- **SpringboardPlatform** - **SpringboardPlatform**
- **Sprout** - **Sprout**
- **sr:mediathek**: Saarländischer Rundfunk - **sr:mediathek**: Saarländischer Rundfunk (**Currently broken**)
- **SRGSSR** - **SRGSSR**
- **SRGSSRPlay**: srf.ch, rts.ch, rsi.ch, rtr.ch and swissinfo.ch play sites - **SRGSSRPlay**: srf.ch, rts.ch, rsi.ch, rtr.ch and swissinfo.ch play sites
- **StacommuLive**: [*stacommu*](## "netrc machine") - **StacommuLive**: [*stacommu*](## "netrc machine")
@@ -1421,7 +1352,6 @@
- **StoryFireSeries** - **StoryFireSeries**
- **StoryFireUser** - **StoryFireUser**
- **Streamable** - **Streamable**
- **streamcloud.eu**
- **StreamCZ** - **StreamCZ**
- **StreamFF** - **StreamFF**
- **StreetVoice** - **StreetVoice**
@@ -1437,7 +1367,6 @@
- **SVTPlay**: SVT Play and Öppet arkiv - **SVTPlay**: SVT Play and Öppet arkiv
- **SVTSeries** - **SVTSeries**
- **SwearnetEpisode** - **SwearnetEpisode**
- **SWRMediathek**
- **Syfy** - **Syfy**
- **SYVDK** - **SYVDK**
- **SztvHu** - **SztvHu**
@@ -1456,7 +1385,6 @@
- **TeachingChannel** - **TeachingChannel**
- **Teamcoco** - **Teamcoco**
- **TeamTreeHouse**: [*teamtreehouse*](## "netrc machine") - **TeamTreeHouse**: [*teamtreehouse*](## "netrc machine")
- **TechTalks**
- **techtv.mit.edu** - **techtv.mit.edu**
- **TedEmbed** - **TedEmbed**
- **TedPlaylist** - **TedPlaylist**
@@ -1486,6 +1414,8 @@
- **TFO** - **TFO**
- **theatercomplextown:ppv**: [*theatercomplextown*](## "netrc machine") - **theatercomplextown:ppv**: [*theatercomplextown*](## "netrc machine")
- **theatercomplextown:vod**: [*theatercomplextown*](## "netrc machine") - **theatercomplextown:vod**: [*theatercomplextown*](## "netrc machine")
- **TheGuardianPodcast**
- **TheGuardianPodcastPlaylist**
- **TheHoleTv** - **TheHoleTv**
- **TheIntercept** - **TheIntercept**
- **ThePlatform** - **ThePlatform**
@@ -1506,27 +1436,23 @@
- **tiktok:sound**: (**Currently broken**) - **tiktok:sound**: (**Currently broken**)
- **tiktok:tag**: (**Currently broken**) - **tiktok:tag**: (**Currently broken**)
- **tiktok:user**: (**Currently broken**) - **tiktok:user**: (**Currently broken**)
- **tinypic**: tinypic.com videos
- **TLC** - **TLC**
- **TMZ** - **TMZ**
- **TNAFlix** - **TNAFlix**
- **TNAFlixNetworkEmbed** - **TNAFlixNetworkEmbed**
- **toggle** - **toggle**
- **toggo** - **toggo**
- **Tokentube**
- **Tokentube:channel**
- **tokfm:audition** - **tokfm:audition**
- **tokfm:podcast** - **tokfm:podcast**
- **ToonGoggles** - **ToonGoggles**
- **tou.tv**: [*toutv*](## "netrc machine") - **tou.tv**: [*toutv*](## "netrc machine")
- **Toypics**: Toypics video - **Toypics**: Toypics video (**Currently broken**)
- **ToypicsUser**: Toypics user profile - **ToypicsUser**: Toypics user profile (**Currently broken**)
- **TrailerAddict**: (**Currently broken**) - **TrailerAddict**: (**Currently broken**)
- **TravelChannel** - **TravelChannel**
- **Triller**: [*triller*](## "netrc machine") - **Triller**: [*triller*](## "netrc machine")
- **TrillerShort** - **TrillerShort**
- **TrillerUser**: [*triller*](## "netrc machine") - **TrillerUser**: [*triller*](## "netrc machine")
- **Trilulilu**
- **Trovo** - **Trovo**
- **TrovoChannelClip**: All Clips of a trovo.live channel; "trovoclip:" prefix - **TrovoChannelClip**: All Clips of a trovo.live channel; "trovoclip:" prefix
- **TrovoChannelVod**: All VODs of a trovo.live channel; "trovovod:" prefix - **TrovoChannelVod**: All VODs of a trovo.live channel; "trovovod:" prefix
@@ -1536,7 +1462,7 @@
- **TruNews** - **TruNews**
- **Truth** - **Truth**
- **TruTV** - **TruTV**
- **Tube8** - **Tube8**: (**Currently broken**)
- **TubeTuGraz**: [*tubetugraz*](## "netrc machine") tube.tugraz.at - **TubeTuGraz**: [*tubetugraz*](## "netrc machine") tube.tugraz.at
- **TubeTuGrazSeries**: [*tubetugraz*](## "netrc machine") - **TubeTuGrazSeries**: [*tubetugraz*](## "netrc machine")
- **TubiTv**: [*tubitv*](## "netrc machine") - **TubiTv**: [*tubitv*](## "netrc machine")
@@ -1545,7 +1471,6 @@
- **TuneInPodcast** - **TuneInPodcast**
- **TuneInPodcastEpisode** - **TuneInPodcastEpisode**
- **TuneInStation** - **TuneInStation**
- **TunePk**
- **Turbo** - **Turbo**
- **tv.dfb.de** - **tv.dfb.de**
- **TV2** - **TV2**
@@ -1569,14 +1494,7 @@
- **TVIPlayer** - **TVIPlayer**
- **tvland.com** - **tvland.com**
- **TVN24** - **TVN24**
- **TVNet**
- **TVNoe** - **TVNoe**
- **TVNow**
- **TVNowAnnual**
- **TVNowFilm**
- **TVNowNew**
- **TVNowSeason**
- **TVNowShow**
- **tvopengr:embed**: tvopen.gr embedded videos - **tvopengr:embed**: tvopen.gr embedded videos
- **tvopengr:watch**: tvopen.gr (and ethnos.gr) videos - **tvopengr:watch**: tvopen.gr (and ethnos.gr) videos
- **tvp**: Telewizja Polska - **tvp**: Telewizja Polska
@@ -1614,7 +1532,6 @@
- **umg:de**: Universal Music Deutschland - **umg:de**: Universal Music Deutschland
- **Unistra** - **Unistra**
- **Unity** - **Unity**
- **UnscriptedNewsVideo**
- **uol.com.br** - **uol.com.br**
- **uplynk** - **uplynk**
- **uplynk:preplay** - **uplynk:preplay**
@@ -1629,7 +1546,6 @@
- **Utreon** - **Utreon**
- **Varzesh3** - **Varzesh3**
- **Vbox7** - **Vbox7**
- **VeeHD**
- **Veo** - **Veo**
- **Veoh** - **Veoh**
- **veoh:user** - **veoh:user**
@@ -1642,7 +1558,6 @@
- **vice** - **vice**
- **vice:article** - **vice:article**
- **vice:show** - **vice:show**
- **Vidbit**
- **Viddler** - **Viddler**
- **Videa** - **Videa**
- **video.arnes.si**: Arnes Video - **video.arnes.si**: Arnes Video
@@ -1664,6 +1579,7 @@
- **VidioLive**: [*vidio*](## "netrc machine") - **VidioLive**: [*vidio*](## "netrc machine")
- **VidioPremier**: [*vidio*](## "netrc machine") - **VidioPremier**: [*vidio*](## "netrc machine")
- **VidLii** - **VidLii**
- **Vidly**
- **viewlift** - **viewlift**
- **viewlift:embed** - **viewlift:embed**
- **Viidea** - **Viidea**
@@ -1683,7 +1599,6 @@
- **Vimm:stream** - **Vimm:stream**
- **ViMP** - **ViMP**
- **ViMP:Playlist** - **ViMP:Playlist**
- **Vimple**: Vimple - one-click video hosting
- **Vine** - **Vine**
- **vine:user** - **vine:user**
- **Viqeo** - **Viqeo**
@@ -1691,7 +1606,6 @@
- **viu:ott**: [*viu*](## "netrc machine") - **viu:ott**: [*viu*](## "netrc machine")
- **viu:playlist** - **viu:playlist**
- **ViuOTTIndonesia** - **ViuOTTIndonesia**
- **Vivo**: vivo.sx
- **vk**: [*vk*](## "netrc machine") VK - **vk**: [*vk*](## "netrc machine") VK
- **vk:uservideos**: [*vk*](## "netrc machine") VK - User's Videos - **vk:uservideos**: [*vk*](## "netrc machine") VK - User's Videos
- **vk:wallpost**: [*vk*](## "netrc machine") - **vk:wallpost**: [*vk*](## "netrc machine")
@@ -1699,37 +1613,27 @@
- **VKPlayLive** - **VKPlayLive**
- **vm.tiktok** - **vm.tiktok**
- **Vocaroo** - **Vocaroo**
- **Vodlocker**
- **VODPl** - **VODPl**
- **VODPlatform** - **VODPlatform**
- **VoiceRepublic**
- **voicy** - **voicy**
- **voicy:channel** - **voicy:channel**
- **VolejTV** - **VolejTV**
- **Voot**: [*voot*](## "netrc machine") - **Voot**: [*voot*](## "netrc machine") (**Currently broken**)
- **VootSeries**: [*voot*](## "netrc machine") - **VootSeries**: [*voot*](## "netrc machine") (**Currently broken**)
- **VoxMedia** - **VoxMedia**
- **VoxMediaVolume** - **VoxMediaVolume**
- **vpro**: npo.nl, ntr.nl, omroepwnl.nl, zapp.nl and npo3.nl - **vpro**: npo.nl, ntr.nl, omroepwnl.nl, zapp.nl and npo3.nl
- **vqq:series** - **vqq:series**
- **vqq:video** - **vqq:video**
- **Vrak**
- **VRT**: VRT NWS, Flanders News, Flandern Info and Sporza - **VRT**: VRT NWS, Flanders News, Flandern Info and Sporza
- **VrtNU**: [*vrtnu*](## "netrc machine") VRT MAX - **VrtNU**: [*vrtnu*](## "netrc machine") VRT MAX
- **vrv**: [*vrv*](## "netrc machine")
- **vrv:series**
- **VShare**
- **VTM** - **VTM**
- **VTXTV**: [*vtxtv*](## "netrc machine") - **VTXTV**: [*vtxtv*](## "netrc machine")
- **VTXTVLive**: [*vtxtv*](## "netrc machine") - **VTXTVLive**: [*vtxtv*](## "netrc machine")
- **VTXTVRecordings**: [*vtxtv*](## "netrc machine") - **VTXTVRecordings**: [*vtxtv*](## "netrc machine")
- **VuClip** - **VuClip**
- **Vupload**
- **VVVVID** - **VVVVID**
- **VVVVIDShow** - **VVVVIDShow**
- **VyboryMos**
- **Vzaar**
- **Wakanim**
- **Walla** - **Walla**
- **WalyTV**: [*walytv*](## "netrc machine") - **WalyTV**: [*walytv*](## "netrc machine")
- **WalyTVLive**: [*walytv*](## "netrc machine") - **WalyTVLive**: [*walytv*](## "netrc machine")
@@ -1740,9 +1644,7 @@
- **washingtonpost** - **washingtonpost**
- **washingtonpost:article** - **washingtonpost:article**
- **wat.tv** - **wat.tv**
- **WatchBox**
- **WatchESPN** - **WatchESPN**
- **WatchIndianPorn**: Watch Indian Porn
- **WDR** - **WDR**
- **wdr:mobile**: (**Currently broken**) - **wdr:mobile**: (**Currently broken**)
- **WDRElefant** - **WDRElefant**
@@ -1770,7 +1672,6 @@
- **whowatch** - **whowatch**
- **Whyp** - **Whyp**
- **wikimedia.org** - **wikimedia.org**
- **Willow**
- **Wimbledon** - **Wimbledon**
- **WimTV** - **WimTV**
- **WinSportsVideo** - **WinSportsVideo**
@@ -1795,7 +1696,6 @@
- **wykop:post** - **wykop:post**
- **wykop:post:comment** - **wykop:post:comment**
- **Xanimu** - **Xanimu**
- **XBef**
- **XboxClips** - **XboxClips**
- **XFileShare**: XFileShare based sites: Aparat, ClipWatching, GoUnlimited, GoVid, HolaVid, Streamty, TheVideoBee, Uqload, VidBom, vidlo, VidLocker, VidShare, VUp, WolfStream, XVideoSharing - **XFileShare**: XFileShare based sites: Aparat, ClipWatching, GoUnlimited, GoVid, HolaVid, Streamty, TheVideoBee, Uqload, VidBom, vidlo, VidLocker, VidShare, VUp, WolfStream, XVideoSharing
- **XHamster** - **XHamster**
@@ -1807,9 +1707,6 @@
- **XMinus** - **XMinus**
- **XNXX** - **XNXX**
- **Xstream** - **Xstream**
- **XTube**
- **XTubeUser**: XTube user profile
- **Xuite**: 隨意窩Xuite影音
- **XVideos** - **XVideos**
- **xvideos:quickies** - **xvideos:quickies**
- **XXXYMovies** - **XXXYMovies**
@@ -1826,10 +1723,7 @@
- **YapFiles** - **YapFiles**
- **Yappy** - **Yappy**
- **YappyProfile** - **YappyProfile**
- **YesJapan**
- **yinyuetai:video**: 音悦Tai
- **YleAreena** - **YleAreena**
- **Ynet**
- **YouJizz** - **YouJizz**
- **youku**: 优酷 - **youku**: 优酷
- **youku:show** - **youku:show**
@@ -1877,6 +1771,9 @@
- **zingmp3:chart-home** - **zingmp3:chart-home**
- **zingmp3:chart-music-video** - **zingmp3:chart-music-video**
- **zingmp3:hub** - **zingmp3:hub**
- **zingmp3:liveradio**
- **zingmp3:podcast**
- **zingmp3:podcast-episode**
- **zingmp3:user** - **zingmp3:user**
- **zingmp3:week-chart** - **zingmp3:week-chart**
- **zoom** - **zoom**

View File

@@ -19,3 +19,8 @@ def handler(request):
pytest.skip(f'{RH_KEY} request handler is not available') pytest.skip(f'{RH_KEY} request handler is not available')
return functools.partial(handler, logger=FakeLogger) return functools.partial(handler, logger=FakeLogger)
def validate_and_send(rh, req):
rh.validate(req)
return rh.send(req)

View File

@@ -10,7 +10,7 @@ import types
import yt_dlp.extractor import yt_dlp.extractor
from yt_dlp import YoutubeDL from yt_dlp import YoutubeDL
from yt_dlp.compat import compat_os_name from yt_dlp.compat import compat_os_name
from yt_dlp.utils import preferredencoding, write_string from yt_dlp.utils import preferredencoding, try_call, write_string
if 'pytest' in sys.modules: if 'pytest' in sys.modules:
import pytest import pytest
@@ -214,14 +214,19 @@ def sanitize_got_info_dict(got_dict):
test_info_dict = { test_info_dict = {
key: sanitize(key, value) for key, value in got_dict.items() key: sanitize(key, value) for key, value in got_dict.items()
if value is not None and key not in IGNORED_FIELDS and not any( if value is not None and key not in IGNORED_FIELDS and (
key.startswith(f'{prefix}_') for prefix in IGNORED_PREFIXES) not any(key.startswith(f'{prefix}_') for prefix in IGNORED_PREFIXES)
or key == '_old_archive_ids')
} }
# display_id may be generated from id # display_id may be generated from id
if test_info_dict.get('display_id') == test_info_dict.get('id'): if test_info_dict.get('display_id') == test_info_dict.get('id'):
test_info_dict.pop('display_id') test_info_dict.pop('display_id')
# release_year may be generated from release_date
if try_call(lambda: test_info_dict['release_year'] == int(test_info_dict['release_date'][:4])):
test_info_dict.pop('release_year')
# Check url for flat entries # Check url for flat entries
if got_dict.get('_type', 'video') != 'video' and got_dict.get('url'): if got_dict.get('_type', 'video') != 'video' and got_dict.get('url'):
test_info_dict['url'] = got_dict['url'] test_info_dict['url'] = got_dict['url']

View File

@@ -140,6 +140,8 @@ class TestFormatSelection(unittest.TestCase):
test('example-with-dashes', 'example-with-dashes') test('example-with-dashes', 'example-with-dashes')
test('all', '2', '47', '45', 'example-with-dashes', '35') test('all', '2', '47', '45', 'example-with-dashes', '35')
test('mergeall', '2+47+45+example-with-dashes+35', multi=True) test('mergeall', '2+47+45+example-with-dashes+35', multi=True)
# See: https://github.com/yt-dlp/yt-dlp/pulls/8797
test('7_a/worst', '35')
def test_format_selection_audio(self): def test_format_selection_audio(self):
formats = [ formats = [
@@ -728,7 +730,7 @@ class TestYoutubeDL(unittest.TestCase):
self.assertEqual(got_dict.get(info_field), expected, info_field) self.assertEqual(got_dict.get(info_field), expected, info_field)
return True return True
test('%()j', (expect_same_infodict, str)) test('%()j', (expect_same_infodict, None))
# NA placeholder # NA placeholder
NA_TEST_OUTTMPL = '%(uploader_date)s-%(width)d-%(x|def)s-%(id)s.%(ext)s' NA_TEST_OUTTMPL = '%(uploader_date)s-%(width)d-%(x|def)s-%(id)s.%(ext)s'
@@ -797,6 +799,7 @@ class TestYoutubeDL(unittest.TestCase):
test('%(title|%)s %(title|%%)s', '% %%') test('%(title|%)s %(title|%%)s', '% %%')
test('%(id+1-height+3)05d', '00158') test('%(id+1-height+3)05d', '00158')
test('%(width+100)05d', 'NA') test('%(width+100)05d', 'NA')
test('%(filesize*8)d', '8192')
test('%(formats.0) 15s', ('% 15s' % FORMATS[0], None)) test('%(formats.0) 15s', ('% 15s' % FORMATS[0], None))
test('%(formats.0)r', (repr(FORMATS[0]), None)) test('%(formats.0)r', (repr(FORMATS[0]), None))
test('%(height.0)03d', '001') test('%(height.0)03d', '001')

View File

@@ -52,6 +52,8 @@ from yt_dlp.networking.exceptions import (
from yt_dlp.utils._utils import _YDLLogger as FakeLogger from yt_dlp.utils._utils import _YDLLogger as FakeLogger
from yt_dlp.utils.networking import HTTPHeaderDict from yt_dlp.utils.networking import HTTPHeaderDict
from test.conftest import validate_and_send
TEST_DIR = os.path.dirname(os.path.abspath(__file__)) TEST_DIR = os.path.dirname(os.path.abspath(__file__))
@@ -275,11 +277,6 @@ class HTTPTestRequestHandler(http.server.BaseHTTPRequestHandler):
self._headers_buffer.append(f'{keyword}: {value}\r\n'.encode()) self._headers_buffer.append(f'{keyword}: {value}\r\n'.encode())
def validate_and_send(rh, req):
rh.validate(req)
return rh.send(req)
class TestRequestHandlerBase: class TestRequestHandlerBase:
@classmethod @classmethod
def setup_class(cls): def setup_class(cls):
@@ -331,7 +328,7 @@ class TestHTTPRequestHandler(TestRequestHandlerBase):
https_server_thread.start() https_server_thread.start()
with handler(verify=False) as rh: with handler(verify=False) as rh:
with pytest.raises(SSLError, match='sslv3 alert handshake failure') as exc_info: with pytest.raises(SSLError, match=r'ssl(?:v3|/tls) alert handshake failure') as exc_info:
validate_and_send(rh, Request(f'https://127.0.0.1:{https_port}/headers')) validate_and_send(rh, Request(f'https://127.0.0.1:{https_port}/headers'))
assert not issubclass(exc_info.type, CertificateVerifyError) assert not issubclass(exc_info.type, CertificateVerifyError)
@@ -872,8 +869,9 @@ class TestRequestsRequestHandler(TestRequestHandlerBase):
]) ])
@pytest.mark.parametrize('handler', ['Requests'], indirect=True) @pytest.mark.parametrize('handler', ['Requests'], indirect=True)
def test_response_error_mapping(self, handler, monkeypatch, raised, expected, match): def test_response_error_mapping(self, handler, monkeypatch, raised, expected, match):
from urllib3.response import HTTPResponse as Urllib3Response
from requests.models import Response as RequestsResponse from requests.models import Response as RequestsResponse
from urllib3.response import HTTPResponse as Urllib3Response
from yt_dlp.networking._requests import RequestsResponseAdapter from yt_dlp.networking._requests import RequestsResponseAdapter
requests_res = RequestsResponse() requests_res = RequestsResponse()
requests_res.raw = Urllib3Response(body=b'', status=200) requests_res.raw = Urllib3Response(body=b'', status=200)
@@ -929,13 +927,17 @@ class TestRequestHandlerValidation:
('http', False, {}), ('http', False, {}),
('https', False, {}), ('https', False, {}),
]), ]),
('Websockets', [
('ws', False, {}),
('wss', False, {}),
]),
(NoCheckRH, [('http', False, {})]), (NoCheckRH, [('http', False, {})]),
(ValidationRH, [('http', UnsupportedRequest, {})]) (ValidationRH, [('http', UnsupportedRequest, {})])
] ]
PROXY_SCHEME_TESTS = [ PROXY_SCHEME_TESTS = [
# scheme, expected to fail # scheme, expected to fail
('Urllib', [ ('Urllib', 'http', [
('http', False), ('http', False),
('https', UnsupportedRequest), ('https', UnsupportedRequest),
('socks4', False), ('socks4', False),
@@ -944,7 +946,7 @@ class TestRequestHandlerValidation:
('socks5h', False), ('socks5h', False),
('socks', UnsupportedRequest), ('socks', UnsupportedRequest),
]), ]),
('Requests', [ ('Requests', 'http', [
('http', False), ('http', False),
('https', False), ('https', False),
('socks4', False), ('socks4', False),
@@ -952,8 +954,11 @@ class TestRequestHandlerValidation:
('socks5', False), ('socks5', False),
('socks5h', False), ('socks5h', False),
]), ]),
(NoCheckRH, [('http', False)]), (NoCheckRH, 'http', [('http', False)]),
(HTTPSupportedRH, [('http', UnsupportedRequest)]), (HTTPSupportedRH, 'http', [('http', UnsupportedRequest)]),
('Websockets', 'ws', [('http', UnsupportedRequest)]),
(NoCheckRH, 'http', [('http', False)]),
(HTTPSupportedRH, 'http', [('http', UnsupportedRequest)]),
] ]
PROXY_KEY_TESTS = [ PROXY_KEY_TESTS = [
@@ -972,7 +977,7 @@ class TestRequestHandlerValidation:
] ]
EXTENSION_TESTS = [ EXTENSION_TESTS = [
('Urllib', [ ('Urllib', 'http', [
({'cookiejar': 'notacookiejar'}, AssertionError), ({'cookiejar': 'notacookiejar'}, AssertionError),
({'cookiejar': YoutubeDLCookieJar()}, False), ({'cookiejar': YoutubeDLCookieJar()}, False),
({'cookiejar': CookieJar()}, AssertionError), ({'cookiejar': CookieJar()}, AssertionError),
@@ -980,17 +985,21 @@ class TestRequestHandlerValidation:
({'timeout': 'notatimeout'}, AssertionError), ({'timeout': 'notatimeout'}, AssertionError),
({'unsupported': 'value'}, UnsupportedRequest), ({'unsupported': 'value'}, UnsupportedRequest),
]), ]),
('Requests', [ ('Requests', 'http', [
({'cookiejar': 'notacookiejar'}, AssertionError), ({'cookiejar': 'notacookiejar'}, AssertionError),
({'cookiejar': YoutubeDLCookieJar()}, False), ({'cookiejar': YoutubeDLCookieJar()}, False),
({'timeout': 1}, False), ({'timeout': 1}, False),
({'timeout': 'notatimeout'}, AssertionError), ({'timeout': 'notatimeout'}, AssertionError),
({'unsupported': 'value'}, UnsupportedRequest), ({'unsupported': 'value'}, UnsupportedRequest),
]), ]),
(NoCheckRH, [ (NoCheckRH, 'http', [
({'cookiejar': 'notacookiejar'}, False), ({'cookiejar': 'notacookiejar'}, False),
({'somerandom': 'test'}, False), # but any extension is allowed through ({'somerandom': 'test'}, False), # but any extension is allowed through
]), ]),
('Websockets', 'ws', [
({'cookiejar': YoutubeDLCookieJar()}, False),
({'timeout': 2}, False),
]),
] ]
@pytest.mark.parametrize('handler,scheme,fail,handler_kwargs', [ @pytest.mark.parametrize('handler,scheme,fail,handler_kwargs', [
@@ -1016,14 +1025,14 @@ class TestRequestHandlerValidation:
run_validation(handler, fail, Request('http://', proxies={proxy_key: 'http://example.com'}))
run_validation(handler, fail, Request('http://'), proxies={proxy_key: 'http://example.com'})
-@pytest.mark.parametrize('handler,scheme,fail', [
-(handler_tests[0], scheme, fail)
+@pytest.mark.parametrize('handler,req_scheme,scheme,fail', [
+(handler_tests[0], handler_tests[1], scheme, fail)
for handler_tests in PROXY_SCHEME_TESTS
-for scheme, fail in handler_tests[1]
+for scheme, fail in handler_tests[2]
], indirect=['handler'])
-def test_proxy_scheme(self, handler, scheme, fail):
-run_validation(handler, fail, Request('http://', proxies={'http': f'{scheme}://example.com'}))
-run_validation(handler, fail, Request('http://'), proxies={'http': f'{scheme}://example.com'})
+def test_proxy_scheme(self, handler, req_scheme, scheme, fail):
+run_validation(handler, fail, Request(f'{req_scheme}://', proxies={req_scheme: f'{scheme}://example.com'}))
+run_validation(handler, fail, Request(f'{req_scheme}://'), proxies={req_scheme: f'{scheme}://example.com'})
@pytest.mark.parametrize('handler', ['Urllib', HTTPSupportedRH, 'Requests'], indirect=True)
def test_empty_proxy(self, handler):
@@ -1035,14 +1044,14 @@ class TestRequestHandlerValidation:
def test_invalid_proxy_url(self, handler, proxy_url):
run_validation(handler, UnsupportedRequest, Request('http://', proxies={'http': proxy_url}))
-@pytest.mark.parametrize('handler,extensions,fail', [
-(handler_tests[0], extensions, fail)
+@pytest.mark.parametrize('handler,scheme,extensions,fail', [
+(handler_tests[0], handler_tests[1], extensions, fail)
for handler_tests in EXTENSION_TESTS
-for extensions, fail in handler_tests[1]
+for extensions, fail in handler_tests[2]
], indirect=['handler'])
-def test_extension(self, handler, extensions, fail):
+def test_extension(self, handler, scheme, extensions, fail):
run_validation(
-handler, fail, Request('http://', extensions=extensions))
+handler, fail, Request(f'{scheme}://', extensions=extensions))
def test_invalid_request_type(self):
rh = self.ValidationRH(logger=FakeLogger())
@@ -1075,6 +1084,22 @@ class FakeRHYDL(FakeYDL):
self._request_director = self.build_request_director([FakeRH])
+class AllUnsupportedRHYDL(FakeYDL):
+def __init__(self, *args, **kwargs):
+class UnsupportedRH(RequestHandler):
+def _send(self, request: Request):
+pass
+_SUPPORTED_FEATURES = ()
+_SUPPORTED_PROXY_SCHEMES = ()
+_SUPPORTED_URL_SCHEMES = ()
+super().__init__(*args, **kwargs)
+self._request_director = self.build_request_director([UnsupportedRH])
class TestRequestDirector:
def test_handler_operations(self):
@@ -1234,6 +1259,12 @@ class TestYoutubeDLNetworking:
with pytest.raises(RequestError, match=r'file:// URLs are disabled by default'):
ydl.urlopen('file://')
+@pytest.mark.parametrize('scheme', (['ws', 'wss']))
+def test_websocket_unavailable_error(self, scheme):
+with AllUnsupportedRHYDL() as ydl:
+with pytest.raises(RequestError, match=r'This request requires WebSocket support'):
+ydl.urlopen(f'{scheme}://')
def test_legacy_server_connect_error(self):
with FakeRHYDL() as ydl:
for error in ('UNSAFE_LEGACY_RENEGOTIATION_DISABLED', 'SSLV3_ALERT_HANDSHAKE_FAILURE'):
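A note on the test plumbing above: `indirect=['handler']` means pytest passes each name through a `handler` fixture that resolves it to a request-handler class (and skips when the backing dependency is unavailable). A minimal sketch of the pattern, with a toy registry that is an assumption rather than the project's actual conftest:

# Minimal sketch of pytest's indirect parametrization (toy registry; the
# project's real conftest also checks feature support, not just presence).
import pytest

HANDLERS = {'Toy': dict}  # assumption: maps handler names to classes


@pytest.fixture
def handler(request):
    # request.param is supplied by @pytest.mark.parametrize(..., indirect=True)
    handler_class = HANDLERS.get(request.param, request.param)
    if isinstance(handler_class, str):
        pytest.skip(f'{request.param} request handler is not available')
    return handler_class


@pytest.mark.parametrize('handler', ['Toy'], indirect=True)
def test_example(handler):
    assert handler is dict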


@@ -210,6 +210,16 @@ class SocksHTTPTestRequestHandler(http.server.BaseHTTPRequestHandler, SocksTestR
self.wfile.write(payload.encode())
+class SocksWebSocketTestRequestHandler(SocksTestRequestHandler):
+def handle(self):
+import websockets.sync.server
+protocol = websockets.ServerProtocol()
+connection = websockets.sync.server.ServerConnection(socket=self.request, protocol=protocol, close_timeout=0)
+connection.handshake()
+connection.send(json.dumps(self.socks_info))
+connection.close()
@contextlib.contextmanager
def socks_server(socks_server_class, request_handler, bind_ip=None, **socks_server_kwargs):
server = server_thread = None
@@ -252,8 +262,22 @@ class HTTPSocksTestProxyContext(SocksProxyTestContext):
return json.loads(handler.send(request).read().decode())
+class WebSocketSocksTestProxyContext(SocksProxyTestContext):
+REQUEST_HANDLER_CLASS = SocksWebSocketTestRequestHandler
+def socks_info_request(self, handler, target_domain=None, target_port=None, **req_kwargs):
+request = Request(f'ws://{target_domain or "127.0.0.1"}:{target_port or "40000"}', **req_kwargs)
+handler.validate(request)
+ws = handler.send(request)
+ws.send('socks_info')
+socks_info = ws.recv()
+ws.close()
+return json.loads(socks_info)
CTX_MAP = {
'http': HTTPSocksTestProxyContext,
+'ws': WebSocketSocksTestProxyContext,
}
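For orientation, the `version` and `auth_methods` fields asserted throughout these proxy tests come straight off the SOCKS wire format parsed by the test servers. An illustrative sketch (not project code) of the first bytes a SOCKS5 client sends:

# Illustrative sketch (not project code): the SOCKS5 client greeting is the
# version byte 0x05, a method count, then the auth-method ids (RFC 1928).
def socks5_greeting(auth_methods=(0x00,)):
    return bytes([0x05, len(auth_methods), *auth_methods])

assert socks5_greeting() == b'\x05\x01\x00'  # no-auth only
assert socks5_greeting((0x00, 0x02)) == b'\x05\x02\x00\x02'  # no-auth or user/pass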
@@ -263,7 +287,7 @@ def ctx(request):
class TestSocks4Proxy:
-@pytest.mark.parametrize('handler,ctx', [('Urllib', 'http'), ('Requests', 'http')], indirect=True)
+@pytest.mark.parametrize('handler,ctx', [('Urllib', 'http'), ('Requests', 'http'), ('Websockets', 'ws')], indirect=True)
def test_socks4_no_auth(self, handler, ctx):
with handler() as rh:
with ctx.socks_server(Socks4ProxyHandler) as server_address:
@@ -271,7 +295,7 @@ class TestSocks4Proxy:
rh, proxies={'all': f'socks4://{server_address}'})
assert response['version'] == 4
-@pytest.mark.parametrize('handler,ctx', [('Urllib', 'http'), ('Requests', 'http')], indirect=True)
+@pytest.mark.parametrize('handler,ctx', [('Urllib', 'http'), ('Requests', 'http'), ('Websockets', 'ws')], indirect=True)
def test_socks4_auth(self, handler, ctx):
with handler() as rh:
with ctx.socks_server(Socks4ProxyHandler, user_id='user') as server_address:
@@ -281,7 +305,7 @@ class TestSocks4Proxy:
rh, proxies={'all': f'socks4://user:@{server_address}'})
assert response['version'] == 4
-@pytest.mark.parametrize('handler,ctx', [('Urllib', 'http'), ('Requests', 'http')], indirect=True)
+@pytest.mark.parametrize('handler,ctx', [('Urllib', 'http'), ('Requests', 'http'), ('Websockets', 'ws')], indirect=True)
def test_socks4a_ipv4_target(self, handler, ctx):
with ctx.socks_server(Socks4ProxyHandler) as server_address:
with handler(proxies={'all': f'socks4a://{server_address}'}) as rh:
@@ -289,7 +313,7 @@ class TestSocks4Proxy:
assert response['version'] == 4
assert (response['ipv4_address'] == '127.0.0.1') != (response['domain_address'] == '127.0.0.1')
-@pytest.mark.parametrize('handler,ctx', [('Urllib', 'http'), ('Requests', 'http')], indirect=True)
+@pytest.mark.parametrize('handler,ctx', [('Urllib', 'http'), ('Requests', 'http'), ('Websockets', 'ws')], indirect=True)
def test_socks4a_domain_target(self, handler, ctx):
with ctx.socks_server(Socks4ProxyHandler) as server_address:
with handler(proxies={'all': f'socks4a://{server_address}'}) as rh:
@@ -298,7 +322,7 @@ class TestSocks4Proxy:
assert response['ipv4_address'] is None
assert response['domain_address'] == 'localhost'
-@pytest.mark.parametrize('handler,ctx', [('Urllib', 'http'), ('Requests', 'http')], indirect=True)
+@pytest.mark.parametrize('handler,ctx', [('Urllib', 'http'), ('Requests', 'http'), ('Websockets', 'ws')], indirect=True)
def test_ipv4_client_source_address(self, handler, ctx):
with ctx.socks_server(Socks4ProxyHandler) as server_address:
source_address = f'127.0.0.{random.randint(5, 255)}'
@@ -308,7 +332,7 @@ class TestSocks4Proxy:
assert response['client_address'][0] == source_address
assert response['version'] == 4
-@pytest.mark.parametrize('handler,ctx', [('Urllib', 'http'), ('Requests', 'http')], indirect=True)
+@pytest.mark.parametrize('handler,ctx', [('Urllib', 'http'), ('Requests', 'http'), ('Websockets', 'ws')], indirect=True)
@pytest.mark.parametrize('reply_code', [
Socks4CD.REQUEST_REJECTED_OR_FAILED,
Socks4CD.REQUEST_REJECTED_CANNOT_CONNECT_TO_IDENTD,
@@ -320,7 +344,7 @@ class TestSocks4Proxy:
with pytest.raises(ProxyError):
ctx.socks_info_request(rh)
-@pytest.mark.parametrize('handler,ctx', [('Urllib', 'http'), ('Requests', 'http')], indirect=True)
+@pytest.mark.parametrize('handler,ctx', [('Urllib', 'http'), ('Requests', 'http'), ('Websockets', 'ws')], indirect=True)
def test_ipv6_socks4_proxy(self, handler, ctx):
with ctx.socks_server(Socks4ProxyHandler, bind_ip='::1') as server_address:
with handler(proxies={'all': f'socks4://{server_address}'}) as rh:
@@ -329,7 +353,7 @@ class TestSocks4Proxy:
assert response['ipv4_address'] == '127.0.0.1'
assert response['version'] == 4
-@pytest.mark.parametrize('handler,ctx', [('Urllib', 'http'), ('Requests', 'http')], indirect=True)
+@pytest.mark.parametrize('handler,ctx', [('Urllib', 'http'), ('Requests', 'http'), ('Websockets', 'ws')], indirect=True)
def test_timeout(self, handler, ctx):
with ctx.socks_server(Socks4ProxyHandler, sleep=2) as server_address:
with handler(proxies={'all': f'socks4://{server_address}'}, timeout=0.5) as rh:
@@ -339,7 +363,7 @@ class TestSocks4Proxy:
class TestSocks5Proxy:
-@pytest.mark.parametrize('handler,ctx', [('Urllib', 'http'), ('Requests', 'http')], indirect=True)
+@pytest.mark.parametrize('handler,ctx', [('Urllib', 'http'), ('Requests', 'http'), ('Websockets', 'ws')], indirect=True)
def test_socks5_no_auth(self, handler, ctx):
with ctx.socks_server(Socks5ProxyHandler) as server_address:
with handler(proxies={'all': f'socks5://{server_address}'}) as rh:
@@ -347,7 +371,7 @@ class TestSocks5Proxy:
assert response['auth_methods'] == [0x0]
assert response['version'] == 5
-@pytest.mark.parametrize('handler,ctx', [('Urllib', 'http'), ('Requests', 'http')], indirect=True)
+@pytest.mark.parametrize('handler,ctx', [('Urllib', 'http'), ('Requests', 'http'), ('Websockets', 'ws')], indirect=True)
def test_socks5_user_pass(self, handler, ctx):
with ctx.socks_server(Socks5ProxyHandler, auth=('test', 'testpass')) as server_address:
with handler() as rh:
@@ -360,7 +384,7 @@ class TestSocks5Proxy:
assert response['auth_methods'] == [Socks5Auth.AUTH_NONE, Socks5Auth.AUTH_USER_PASS]
assert response['version'] == 5
-@pytest.mark.parametrize('handler,ctx', [('Urllib', 'http'), ('Requests', 'http')], indirect=True)
+@pytest.mark.parametrize('handler,ctx', [('Urllib', 'http'), ('Requests', 'http'), ('Websockets', 'ws')], indirect=True)
def test_socks5_ipv4_target(self, handler, ctx):
with ctx.socks_server(Socks5ProxyHandler) as server_address:
with handler(proxies={'all': f'socks5://{server_address}'}) as rh:
@@ -368,7 +392,7 @@ class TestSocks5Proxy:
assert response['ipv4_address'] == '127.0.0.1'
assert response['version'] == 5
-@pytest.mark.parametrize('handler,ctx', [('Urllib', 'http'), ('Requests', 'http')], indirect=True)
+@pytest.mark.parametrize('handler,ctx', [('Urllib', 'http'), ('Requests', 'http'), ('Websockets', 'ws')], indirect=True)
def test_socks5_domain_target(self, handler, ctx):
with ctx.socks_server(Socks5ProxyHandler) as server_address:
with handler(proxies={'all': f'socks5://{server_address}'}) as rh:
@@ -376,7 +400,7 @@ class TestSocks5Proxy:
assert (response['ipv4_address'] == '127.0.0.1') != (response['ipv6_address'] == '::1')
assert response['version'] == 5
-@pytest.mark.parametrize('handler,ctx', [('Urllib', 'http'), ('Requests', 'http')], indirect=True)
+@pytest.mark.parametrize('handler,ctx', [('Urllib', 'http'), ('Requests', 'http'), ('Websockets', 'ws')], indirect=True)
def test_socks5h_domain_target(self, handler, ctx):
with ctx.socks_server(Socks5ProxyHandler) as server_address:
with handler(proxies={'all': f'socks5h://{server_address}'}) as rh:
@@ -385,7 +409,7 @@ class TestSocks5Proxy:
assert response['domain_address'] == 'localhost'
assert response['version'] == 5
-@pytest.mark.parametrize('handler,ctx', [('Urllib', 'http'), ('Requests', 'http')], indirect=True)
+@pytest.mark.parametrize('handler,ctx', [('Urllib', 'http'), ('Requests', 'http'), ('Websockets', 'ws')], indirect=True)
def test_socks5h_ip_target(self, handler, ctx):
with ctx.socks_server(Socks5ProxyHandler) as server_address:
with handler(proxies={'all': f'socks5h://{server_address}'}) as rh:
@@ -394,7 +418,7 @@ class TestSocks5Proxy:
assert response['domain_address'] is None
assert response['version'] == 5
-@pytest.mark.parametrize('handler,ctx', [('Urllib', 'http'), ('Requests', 'http')], indirect=True)
+@pytest.mark.parametrize('handler,ctx', [('Urllib', 'http'), ('Requests', 'http'), ('Websockets', 'ws')], indirect=True)
def test_socks5_ipv6_destination(self, handler, ctx):
with ctx.socks_server(Socks5ProxyHandler) as server_address:
with handler(proxies={'all': f'socks5://{server_address}'}) as rh:
@@ -402,7 +426,7 @@ class TestSocks5Proxy:
assert response['ipv6_address'] == '::1'
assert response['version'] == 5
-@pytest.mark.parametrize('handler,ctx', [('Urllib', 'http'), ('Requests', 'http')], indirect=True)
+@pytest.mark.parametrize('handler,ctx', [('Urllib', 'http'), ('Requests', 'http'), ('Websockets', 'ws')], indirect=True)
def test_ipv6_socks5_proxy(self, handler, ctx):
with ctx.socks_server(Socks5ProxyHandler, bind_ip='::1') as server_address:
with handler(proxies={'all': f'socks5://{server_address}'}) as rh:
@@ -413,7 +437,7 @@ class TestSocks5Proxy:
# XXX: is there any feasible way of testing IPv6 source addresses?
# Same would go for non-proxy source_address test...
-@pytest.mark.parametrize('handler,ctx', [('Urllib', 'http'), ('Requests', 'http')], indirect=True)
+@pytest.mark.parametrize('handler,ctx', [('Urllib', 'http'), ('Requests', 'http'), ('Websockets', 'ws')], indirect=True)
def test_ipv4_client_source_address(self, handler, ctx):
with ctx.socks_server(Socks5ProxyHandler) as server_address:
source_address = f'127.0.0.{random.randint(5, 255)}'
@@ -422,7 +446,7 @@ class TestSocks5Proxy:
assert response['client_address'][0] == source_address
assert response['version'] == 5
-@pytest.mark.parametrize('handler,ctx', [('Urllib', 'http'), ('Requests', 'http')], indirect=True)
+@pytest.mark.parametrize('handler,ctx', [('Urllib', 'http'), ('Requests', 'http'), ('Websockets', 'ws')], indirect=True)
@pytest.mark.parametrize('reply_code', [
Socks5Reply.GENERAL_FAILURE,
Socks5Reply.CONNECTION_NOT_ALLOWED,
@@ -439,7 +463,7 @@ class TestSocks5Proxy:
with pytest.raises(ProxyError):
ctx.socks_info_request(rh)
-@pytest.mark.parametrize('handler,ctx', [('Urllib', 'http')], indirect=True)
+@pytest.mark.parametrize('handler,ctx', [('Urllib', 'http'), ('Websockets', 'ws')], indirect=True)
def test_timeout(self, handler, ctx):
with ctx.socks_server(Socks5ProxyHandler, sleep=2) as server_address:
with handler(proxies={'all': f'socks5://{server_address}'}, timeout=1) as rh:
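As an end-to-end usage sketch of what these parametrized tests exercise (the proxy address is a placeholder, and ws:// URLs additionally need the websockets dependency):

# Usage sketch: a single proxy setting covers http(s), socks and, with the
# Websockets handler installed, ws:// traffic alike.
from yt_dlp import YoutubeDL

with YoutubeDL({'proxy': 'socks5://127.0.0.1:1080'}) as ydl:  # placeholder address
    response = ydl.urlopen('https://example.com')  # routed through the proxy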


@@ -9,7 +9,15 @@ sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
from test.helper import FakeYDL, report_warning
-from yt_dlp.update import Updater, UpdateInfo
+from yt_dlp.update import UpdateInfo, Updater
+# XXX: Keep in sync with yt_dlp.update.UPDATE_SOURCES
+TEST_UPDATE_SOURCES = {
+'stable': 'yt-dlp/yt-dlp',
+'nightly': 'yt-dlp/yt-dlp-nightly-builds',
+'master': 'yt-dlp/yt-dlp-master-builds',
+}
TEST_API_DATA = {
'yt-dlp/yt-dlp/latest': {
@@ -68,25 +76,34 @@ TEST_API_DATA = {
},
}
-TEST_LOCKFILE_V1 = '''# This file is used for regulating self-update
-lock 2022.08.18.36 .+ Python 3.6
-lock 2023.11.13 .+ Python 3.7
-'''
-TEST_LOCKFILE_V2 = '''# This file is used for regulating self-update
-lockV2 yt-dlp/yt-dlp 2022.08.18.36 .+ Python 3.6
-lockV2 yt-dlp/yt-dlp 2023.11.13 .+ Python 3.7
-'''
-TEST_LOCKFILE_V1_V2 = '''# This file is used for regulating self-update
-lock 2022.08.18.36 .+ Python 3.6
-lock 2023.11.13 .+ Python 3.7
-lockV2 yt-dlp/yt-dlp 2022.08.18.36 .+ Python 3.6
-lockV2 yt-dlp/yt-dlp 2023.11.13 .+ Python 3.7
-lockV2 fork/yt-dlp pr0000 .+ Python 3.6
-lockV2 fork/yt-dlp pr1234 .+ Python 3.7
-lockV2 fork/yt-dlp pr9999 .+ Python 3.11
-'''
+TEST_LOCKFILE_COMMENT = '# This file is used for regulating self-update'
+TEST_LOCKFILE_V1 = r'''%s
+lock 2022.08.18.36 .+ Python 3\.6
+lock 2023.11.16 (?!win_x86_exe).+ Python 3\.7
+lock 2023.11.16 win_x86_exe .+ Windows-(?:Vista|2008Server)
+''' % TEST_LOCKFILE_COMMENT
+TEST_LOCKFILE_V2_TMPL = r'''%s
+lockV2 yt-dlp/yt-dlp 2022.08.18.36 .+ Python 3\.6
+lockV2 yt-dlp/yt-dlp 2023.11.16 (?!win_x86_exe).+ Python 3\.7
+lockV2 yt-dlp/yt-dlp 2023.11.16 win_x86_exe .+ Windows-(?:Vista|2008Server)
+lockV2 yt-dlp/yt-dlp-nightly-builds 2023.11.15.232826 (?!win_x86_exe).+ Python 3\.7
+lockV2 yt-dlp/yt-dlp-nightly-builds 2023.11.15.232826 win_x86_exe .+ Windows-(?:Vista|2008Server)
+lockV2 yt-dlp/yt-dlp-master-builds 2023.11.15.232812 (?!win_x86_exe).+ Python 3\.7
+lockV2 yt-dlp/yt-dlp-master-builds 2023.11.15.232812 win_x86_exe .+ Windows-(?:Vista|2008Server)
+'''
+TEST_LOCKFILE_V2 = TEST_LOCKFILE_V2_TMPL % TEST_LOCKFILE_COMMENT
+TEST_LOCKFILE_ACTUAL = TEST_LOCKFILE_V2_TMPL % TEST_LOCKFILE_V1.rstrip('\n')
+TEST_LOCKFILE_FORK = r'''%s# Test if a fork blocks updates to non-numeric tags
+lockV2 fork/yt-dlp pr0000 .+ Python 3.6
+lockV2 fork/yt-dlp pr1234 (?!win_x86_exe).+ Python 3\.7
+lockV2 fork/yt-dlp pr1234 win_x86_exe .+ Windows-(?:Vista|2008Server)
+lockV2 fork/yt-dlp pr9999 .+ Python 3.11
+''' % TEST_LOCKFILE_ACTUAL
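To make the lock semantics concrete: each lock/lockV2 line pairs a version cap with a regex over the variant identifier, and a matching line caps how far that variant may update. A simplified sketch of that gating (an assumption for illustration; the real Updater also handles channels, exact pins and tag ordering):

# Simplified sketch (not the Updater's actual code) of lockV2 gating.
import re

def max_allowed_tag(lockfile, repo, identifier):
    for line in lockfile.splitlines():
        parts = line.split(None, 3)
        if len(parts) == 4 and parts[0] == 'lockV2' and parts[1] == repo:
            _, _, tag, pattern = parts
            if re.match(pattern, identifier):
                return tag  # updates are capped at this tag
    return None  # no lock applies

lock = r'lockV2 yt-dlp/yt-dlp 2023.11.16 (?!win_x86_exe).+ Python 3\.7'
assert max_allowed_tag(lock, 'yt-dlp/yt-dlp', 'zip stable Python 3.7.1') == '2023.11.16'
assert max_allowed_tag(lock, 'yt-dlp/yt-dlp', 'win_x86_exe Python 3.7.9') is None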
class FakeUpdater(Updater):
@@ -95,9 +112,10 @@ class FakeUpdater(Updater):
_channel = 'stable'
_origin = 'yt-dlp/yt-dlp'
+_update_sources = TEST_UPDATE_SOURCES
def _download_update_spec(self, *args, **kwargs):
-return TEST_LOCKFILE_V1_V2
+return TEST_LOCKFILE_ACTUAL
def _call_api(self, tag):
tag = f'tags/{tag}' if tag != 'latest' else tag
@@ -112,7 +130,7 @@ class TestUpdate(unittest.TestCase):
def test_update_spec(self):
ydl = FakeYDL()
-updater = FakeUpdater(ydl, 'stable@latest')
+updater = FakeUpdater(ydl, 'stable')
def test(lockfile, identifier, input_tag, expect_tag, exact=False, repo='yt-dlp/yt-dlp'):
updater._identifier = identifier
@@ -124,35 +142,46 @@ class TestUpdate(unittest.TestCase):
f'{identifier!r} requesting {repo}@{input_tag} (exact={exact}) '
f'returned {result!r} instead of {expect_tag!r}')
-test(TEST_LOCKFILE_V1, 'zip Python 3.11.0', '2023.11.13', '2023.11.13')
-test(TEST_LOCKFILE_V1, 'zip stable Python 3.11.0', '2023.11.13', '2023.11.13', exact=True)
-test(TEST_LOCKFILE_V1, 'zip Python 3.6.0', '2023.11.13', '2022.08.18.36')
-test(TEST_LOCKFILE_V1, 'zip stable Python 3.6.0', '2023.11.13', None, exact=True)
-test(TEST_LOCKFILE_V1, 'zip Python 3.7.0', '2023.11.13', '2023.11.13')
-test(TEST_LOCKFILE_V1, 'zip stable Python 3.7.1', '2023.11.13', '2023.11.13')
-test(TEST_LOCKFILE_V1, 'zip Python 3.7.1', '2023.12.31', '2023.11.13')
-test(TEST_LOCKFILE_V1, 'zip stable Python 3.7.1', '2023.12.31', '2023.11.13')
-test(TEST_LOCKFILE_V2, 'zip Python 3.11.1', '2023.11.13', '2023.11.13')
-test(TEST_LOCKFILE_V2, 'zip stable Python 3.11.1', '2023.12.31', '2023.12.31')
-test(TEST_LOCKFILE_V2, 'zip Python 3.6.1', '2023.11.13', '2022.08.18.36')
-test(TEST_LOCKFILE_V2, 'zip stable Python 3.7.2', '2023.11.13', '2023.11.13')
-test(TEST_LOCKFILE_V2, 'zip Python 3.7.2', '2023.12.31', '2023.11.13')
-test(TEST_LOCKFILE_V1_V2, 'zip Python 3.11.2', '2023.11.13', '2023.11.13')
-test(TEST_LOCKFILE_V1_V2, 'zip stable Python 3.11.2', '2023.12.31', '2023.12.31')
-test(TEST_LOCKFILE_V1_V2, 'zip Python 3.6.2', '2023.11.13', '2022.08.18.36')
-test(TEST_LOCKFILE_V1_V2, 'zip stable Python 3.7.3', '2023.11.13', '2023.11.13')
-test(TEST_LOCKFILE_V1_V2, 'zip Python 3.7.3', '2023.12.31', '2023.11.13')
-test(TEST_LOCKFILE_V1_V2, 'zip Python 3.6.3', 'pr0000', None, repo='fork/yt-dlp')
-test(TEST_LOCKFILE_V1_V2, 'zip stable Python 3.7.4', 'pr0000', 'pr0000', repo='fork/yt-dlp')
-test(TEST_LOCKFILE_V1_V2, 'zip Python 3.6.4', 'pr0000', None, repo='fork/yt-dlp')
-test(TEST_LOCKFILE_V1_V2, 'zip Python 3.7.4', 'pr1234', None, repo='fork/yt-dlp')
-test(TEST_LOCKFILE_V1_V2, 'zip stable Python 3.8.1', 'pr1234', 'pr1234', repo='fork/yt-dlp')
-test(TEST_LOCKFILE_V1_V2, 'zip Python 3.7.5', 'pr1234', None, repo='fork/yt-dlp')
-test(TEST_LOCKFILE_V1_V2, 'zip Python 3.11.3', 'pr9999', None, repo='fork/yt-dlp')
-test(TEST_LOCKFILE_V1_V2, 'zip stable Python 3.12.0', 'pr9999', 'pr9999', repo='fork/yt-dlp')
-test(TEST_LOCKFILE_V1_V2, 'zip Python 3.11.4', 'pr9999', None, repo='fork/yt-dlp')
+for lockfile in (TEST_LOCKFILE_V1, TEST_LOCKFILE_V2, TEST_LOCKFILE_ACTUAL, TEST_LOCKFILE_FORK):
+# Normal operation
+test(lockfile, 'zip Python 3.12.0', '2023.12.31', '2023.12.31')
+test(lockfile, 'zip stable Python 3.12.0', '2023.12.31', '2023.12.31', exact=True)
+# Python 3.6 --update should update only to its lock
+test(lockfile, 'zip Python 3.6.0', '2023.11.16', '2022.08.18.36')
+# --update-to an exact version later than the lock should return None
+test(lockfile, 'zip stable Python 3.6.0', '2023.11.16', None, exact=True)
+# Python 3.7 should be able to update to its lock
+test(lockfile, 'zip Python 3.7.0', '2023.11.16', '2023.11.16')
+test(lockfile, 'zip stable Python 3.7.1', '2023.11.16', '2023.11.16', exact=True)
+# Non-win_x86_exe builds on py3.7 must be locked
+test(lockfile, 'zip Python 3.7.1', '2023.12.31', '2023.11.16')
+test(lockfile, 'zip stable Python 3.7.1', '2023.12.31', None, exact=True)
+test(  # Windows Vista w/ win_x86_exe must be locked
+lockfile, 'win_x86_exe stable Python 3.7.9 (CPython x86 32bit) - Windows-Vista-6.0.6003-SP2',
+'2023.12.31', '2023.11.16')
+test(  # Windows 2008Server w/ win_x86_exe must be locked
+lockfile, 'win_x86_exe Python 3.7.9 (CPython x86 32bit) - Windows-2008Server',
+'2023.12.31', None, exact=True)
+test(  # Windows 7 w/ win_x86_exe py3.7 build should be able to update beyond lock
+lockfile, 'win_x86_exe stable Python 3.7.9 (CPython x86 32bit) - Windows-7-6.1.7601-SP1',
+'2023.12.31', '2023.12.31')
+test(  # Windows 8.1 w/ '2008Server' in platform string should be able to update beyond lock
+lockfile, 'win_x86_exe Python 3.7.9 (CPython x86 32bit) - Windows-post2008Server-6.2.9200',
+'2023.12.31', '2023.12.31', exact=True)
+# Forks can block updates to non-numeric tags rather than lock
+test(TEST_LOCKFILE_FORK, 'zip Python 3.6.3', 'pr0000', None, repo='fork/yt-dlp')
+test(TEST_LOCKFILE_FORK, 'zip stable Python 3.7.4', 'pr0000', 'pr0000', repo='fork/yt-dlp')
+test(TEST_LOCKFILE_FORK, 'zip stable Python 3.7.4', 'pr1234', None, repo='fork/yt-dlp')
+test(TEST_LOCKFILE_FORK, 'zip Python 3.8.1', 'pr1234', 'pr1234', repo='fork/yt-dlp', exact=True)
+test(
+TEST_LOCKFILE_FORK, 'win_x86_exe stable Python 3.7.9 (CPython x86 32bit) - Windows-Vista-6.0.6003-SP2',
+'pr1234', None, repo='fork/yt-dlp')
+test(
+TEST_LOCKFILE_FORK, 'win_x86_exe stable Python 3.7.9 (CPython x86 32bit) - Windows-7-6.1.7601-SP1',
+'2023.12.31', '2023.12.31', repo='fork/yt-dlp')
+test(TEST_LOCKFILE_FORK, 'zip Python 3.11.2', 'pr9999', None, repo='fork/yt-dlp', exact=True)
+test(TEST_LOCKFILE_FORK, 'zip stable Python 3.12.0', 'pr9999', 'pr9999', repo='fork/yt-dlp')
def test_query_update(self):
ydl = FakeYDL()


@@ -2110,6 +2110,8 @@ Line 1
self.assertEqual(traverse_obj(_TEST_DATA, (..., {str_or_none})),
[item for item in map(str_or_none, _TEST_DATA.values()) if item is not None],
msg='Function in set should be a transformation')
+self.assertEqual(traverse_obj(_TEST_DATA, ('fail', {lambda _: 'const'})), 'const',
+msg='Function in set should always be called')
if __debug__:
with self.assertRaises(Exception, msg='Sets with length != 1 should raise in debug'):
traverse_obj(_TEST_DATA, set())
@@ -2317,23 +2319,6 @@ Line 1
self.assertEqual(traverse_obj({}, (0, slice(1)), traverse_string=True), [],
msg='branching should result in list if `traverse_string`')
-# Test is_user_input behavior
-_IS_USER_INPUT_DATA = {'range8': list(range(8))}
-self.assertEqual(traverse_obj(_IS_USER_INPUT_DATA, ('range8', '3'),
-is_user_input=True), 3,
-msg='allow for string indexing if `is_user_input`')
-self.assertCountEqual(traverse_obj(_IS_USER_INPUT_DATA, ('range8', '3:'),
-is_user_input=True), tuple(range(8))[3:],
-msg='allow for string slice if `is_user_input`')
-self.assertCountEqual(traverse_obj(_IS_USER_INPUT_DATA, ('range8', ':4:2'),
-is_user_input=True), tuple(range(8))[:4:2],
-msg='allow step in string slice if `is_user_input`')
-self.assertCountEqual(traverse_obj(_IS_USER_INPUT_DATA, ('range8', ':'),
-is_user_input=True), range(8),
-msg='`:` should be treated as `...` if `is_user_input`')
-with self.assertRaises(TypeError, msg='too many params should result in error'):
-traverse_obj(_IS_USER_INPUT_DATA, ('range8', ':::'), is_user_input=True)
# Test re.Match as input obj
mobj = re.fullmatch(r'0(12)(?P<group>3)(4)?', '0123')
self.assertEqual(traverse_obj(mobj, ...), [x for x in mobj.groups() if x is not None],
@@ -2387,6 +2372,11 @@ Line 1
headers4 = HTTPHeaderDict({'ytdl-test': 'data;'})
self.assertEqual(set(headers4.items()), {('Ytdl-Test', 'data;')})
+# common mistake: strip whitespace from values
+# https://github.com/yt-dlp/yt-dlp/issues/8729
+headers5 = HTTPHeaderDict({'ytdl-test': ' data; '})
+self.assertEqual(set(headers5.items()), {('Ytdl-Test', 'data;')})
def test_extract_basic_auth(self):
assert extract_basic_auth('http://:foo.bar') == ('http://:foo.bar', None)
assert extract_basic_auth('http://foo.bar') == ('http://foo.bar', None)
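A quick usage sketch of the normalization exercised above: HTTPHeaderDict title-cases header names and, with this change, strips surrounding whitespace from values that would otherwise be sent malformed:

# Usage sketch, mirroring the test above.
from yt_dlp.utils.networking import HTTPHeaderDict

headers = HTTPHeaderDict({'ytdl-test': ' data; '})
assert set(headers.items()) == {('Ytdl-Test', 'data;')}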

test/test_websockets.py (new file, 380 lines)

@@ -0,0 +1,380 @@
#!/usr/bin/env python3
# Allow direct execution
import os
import sys
import pytest
sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
import http.client
import http.cookiejar
import http.server
import json
import random
import ssl
import threading
from yt_dlp import socks
from yt_dlp.cookies import YoutubeDLCookieJar
from yt_dlp.dependencies import websockets
from yt_dlp.networking import Request
from yt_dlp.networking.exceptions import (
CertificateVerifyError,
HTTPError,
ProxyError,
RequestError,
SSLError,
TransportError,
)
from yt_dlp.utils.networking import HTTPHeaderDict
from test.conftest import validate_and_send
TEST_DIR = os.path.dirname(os.path.abspath(__file__))
def websocket_handler(websocket):
for message in websocket:
if isinstance(message, bytes):
if message == b'bytes':
return websocket.send('2')
elif isinstance(message, str):
if message == 'headers':
return websocket.send(json.dumps(dict(websocket.request.headers)))
elif message == 'path':
return websocket.send(websocket.request.path)
elif message == 'source_address':
return websocket.send(websocket.remote_address[0])
elif message == 'str':
return websocket.send('1')
return websocket.send(message)
def process_request(self, request):
if request.path.startswith('/gen_'):
status = http.HTTPStatus(int(request.path[5:]))
if 300 <= status.value <= 300:
return websockets.http11.Response(
status.value, status.phrase, websockets.datastructures.Headers([('Location', '/')]), b'')
return self.protocol.reject(status.value, status.phrase)
return self.protocol.accept(request)
def create_websocket_server(**ws_kwargs):
import websockets.sync.server
wsd = websockets.sync.server.serve(websocket_handler, '127.0.0.1', 0, process_request=process_request, **ws_kwargs)
ws_port = wsd.socket.getsockname()[1]
ws_server_thread = threading.Thread(target=wsd.serve_forever)
ws_server_thread.daemon = True
ws_server_thread.start()
return ws_server_thread, ws_port
def create_ws_websocket_server():
return create_websocket_server()
def create_wss_websocket_server():
certfn = os.path.join(TEST_DIR, 'testcert.pem')
sslctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
sslctx.load_cert_chain(certfn, None)
return create_websocket_server(ssl_context=sslctx)
MTLS_CERT_DIR = os.path.join(TEST_DIR, 'testdata', 'certificate')
def create_mtls_wss_websocket_server():
certfn = os.path.join(TEST_DIR, 'testcert.pem')
cacertfn = os.path.join(MTLS_CERT_DIR, 'ca.crt')
sslctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
sslctx.verify_mode = ssl.CERT_REQUIRED
sslctx.load_verify_locations(cafile=cacertfn)
sslctx.load_cert_chain(certfn, None)
return create_websocket_server(ssl_context=sslctx)
@pytest.mark.skipif(not websockets, reason='websockets must be installed to test websocket request handlers')
class TestWebsSocketRequestHandlerConformance:
@classmethod
def setup_class(cls):
cls.ws_thread, cls.ws_port = create_ws_websocket_server()
cls.ws_base_url = f'ws://127.0.0.1:{cls.ws_port}'
cls.wss_thread, cls.wss_port = create_wss_websocket_server()
cls.wss_base_url = f'wss://127.0.0.1:{cls.wss_port}'
cls.bad_wss_thread, cls.bad_wss_port = create_websocket_server(ssl_context=ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER))
cls.bad_wss_host = f'wss://127.0.0.1:{cls.bad_wss_port}'
cls.mtls_wss_thread, cls.mtls_wss_port = create_mtls_wss_websocket_server()
cls.mtls_wss_base_url = f'wss://127.0.0.1:{cls.mtls_wss_port}'
@pytest.mark.parametrize('handler', ['Websockets'], indirect=True)
def test_basic_websockets(self, handler):
with handler() as rh:
ws = validate_and_send(rh, Request(self.ws_base_url))
assert 'upgrade' in ws.headers
assert ws.status == 101
ws.send('foo')
assert ws.recv() == 'foo'
ws.close()
# https://www.rfc-editor.org/rfc/rfc6455.html#section-5.6
@pytest.mark.parametrize('msg,opcode', [('str', 1), (b'bytes', 2)])
@pytest.mark.parametrize('handler', ['Websockets'], indirect=True)
def test_send_types(self, handler, msg, opcode):
with handler() as rh:
ws = validate_and_send(rh, Request(self.ws_base_url))
ws.send(msg)
assert int(ws.recv()) == opcode
ws.close()
@pytest.mark.parametrize('handler', ['Websockets'], indirect=True)
def test_verify_cert(self, handler):
with handler() as rh:
with pytest.raises(CertificateVerifyError):
validate_and_send(rh, Request(self.wss_base_url))
with handler(verify=False) as rh:
ws = validate_and_send(rh, Request(self.wss_base_url))
assert ws.status == 101
ws.close()
@pytest.mark.parametrize('handler', ['Websockets'], indirect=True)
def test_ssl_error(self, handler):
with handler(verify=False) as rh:
with pytest.raises(SSLError, match=r'ssl(?:v3|/tls) alert handshake failure') as exc_info:
validate_and_send(rh, Request(self.bad_wss_host))
assert not issubclass(exc_info.type, CertificateVerifyError)
@pytest.mark.parametrize('handler', ['Websockets'], indirect=True)
@pytest.mark.parametrize('path,expected', [
# Unicode characters should be encoded with uppercase percent-encoding
('/中文', '/%E4%B8%AD%E6%96%87'),
# don't normalize existing percent encodings
('/%c7%9f', '/%c7%9f'),
])
def test_percent_encode(self, handler, path, expected):
with handler() as rh:
ws = validate_and_send(rh, Request(f'{self.ws_base_url}{path}'))
ws.send('path')
assert ws.recv() == expected
assert ws.status == 101
ws.close()
@pytest.mark.parametrize('handler', ['Websockets'], indirect=True)
def test_remove_dot_segments(self, handler):
with handler() as rh:
# This isn't a comprehensive test,
# but it should be enough to check whether the handler is removing dot segments
ws = validate_and_send(rh, Request(f'{self.ws_base_url}/a/b/./../../test'))
assert ws.status == 101
ws.send('path')
assert ws.recv() == '/test'
ws.close()
# We are restricted to known HTTP status codes in http.HTTPStatus
# Redirects are not supported for websockets
@pytest.mark.parametrize('handler', ['Websockets'], indirect=True)
@pytest.mark.parametrize('status', (200, 204, 301, 302, 303, 400, 500, 511))
def test_raise_http_error(self, handler, status):
with handler() as rh:
with pytest.raises(HTTPError) as exc_info:
validate_and_send(rh, Request(f'{self.ws_base_url}/gen_{status}'))
assert exc_info.value.status == status
@pytest.mark.parametrize('handler', ['Websockets'], indirect=True)
@pytest.mark.parametrize('params,extensions', [
({'timeout': 0.00001}, {}),
({}, {'timeout': 0.00001}),
])
def test_timeout(self, handler, params, extensions):
with handler(**params) as rh:
with pytest.raises(TransportError):
validate_and_send(rh, Request(self.ws_base_url, extensions=extensions))
@pytest.mark.parametrize('handler', ['Websockets'], indirect=True)
def test_cookies(self, handler):
cookiejar = YoutubeDLCookieJar()
cookiejar.set_cookie(http.cookiejar.Cookie(
version=0, name='test', value='ytdlp', port=None, port_specified=False,
domain='127.0.0.1', domain_specified=True, domain_initial_dot=False, path='/',
path_specified=True, secure=False, expires=None, discard=False, comment=None,
comment_url=None, rest={}))
with handler(cookiejar=cookiejar) as rh:
ws = validate_and_send(rh, Request(self.ws_base_url))
ws.send('headers')
assert json.loads(ws.recv())['cookie'] == 'test=ytdlp'
ws.close()
with handler() as rh:
ws = validate_and_send(rh, Request(self.ws_base_url))
ws.send('headers')
assert 'cookie' not in json.loads(ws.recv())
ws.close()
ws = validate_and_send(rh, Request(self.ws_base_url, extensions={'cookiejar': cookiejar}))
ws.send('headers')
assert json.loads(ws.recv())['cookie'] == 'test=ytdlp'
ws.close()
@pytest.mark.parametrize('handler', ['Websockets'], indirect=True)
def test_source_address(self, handler):
source_address = f'127.0.0.{random.randint(5, 255)}'
with handler(source_address=source_address) as rh:
ws = validate_and_send(rh, Request(self.ws_base_url))
ws.send('source_address')
assert source_address == ws.recv()
ws.close()
@pytest.mark.parametrize('handler', ['Websockets'], indirect=True)
def test_response_url(self, handler):
with handler() as rh:
url = f'{self.ws_base_url}/something'
ws = validate_and_send(rh, Request(url))
assert ws.url == url
ws.close()
@pytest.mark.parametrize('handler', ['Websockets'], indirect=True)
def test_request_headers(self, handler):
with handler(headers=HTTPHeaderDict({'test1': 'test', 'test2': 'test2'})) as rh:
# Global Headers
ws = validate_and_send(rh, Request(self.ws_base_url))
ws.send('headers')
headers = HTTPHeaderDict(json.loads(ws.recv()))
assert headers['test1'] == 'test'
ws.close()
# Per request headers, merged with global
ws = validate_and_send(rh, Request(
self.ws_base_url, headers={'test2': 'changed', 'test3': 'test3'}))
ws.send('headers')
headers = HTTPHeaderDict(json.loads(ws.recv()))
assert headers['test1'] == 'test'
assert headers['test2'] == 'changed'
assert headers['test3'] == 'test3'
ws.close()
@pytest.mark.parametrize('client_cert', (
{'client_certificate': os.path.join(MTLS_CERT_DIR, 'clientwithkey.crt')},
{
'client_certificate': os.path.join(MTLS_CERT_DIR, 'client.crt'),
'client_certificate_key': os.path.join(MTLS_CERT_DIR, 'client.key'),
},
{
'client_certificate': os.path.join(MTLS_CERT_DIR, 'clientwithencryptedkey.crt'),
'client_certificate_password': 'foobar',
},
{
'client_certificate': os.path.join(MTLS_CERT_DIR, 'client.crt'),
'client_certificate_key': os.path.join(MTLS_CERT_DIR, 'clientencrypted.key'),
'client_certificate_password': 'foobar',
}
))
@pytest.mark.parametrize('handler', ['Websockets'], indirect=True)
def test_mtls(self, handler, client_cert):
with handler(
# Disable client-side validation of unacceptable self-signed testcert.pem
# The test is of a check on the server side, so unaffected
verify=False,
client_cert=client_cert
) as rh:
validate_and_send(rh, Request(self.mtls_wss_base_url)).close()
def create_fake_ws_connection(raised):
import websockets.sync.client
class FakeWsConnection(websockets.sync.client.ClientConnection):
def __init__(self, *args, **kwargs):
class FakeResponse:
body = b''
headers = {}
status_code = 101
reason_phrase = 'test'
self.response = FakeResponse()
def send(self, *args, **kwargs):
raise raised()
def recv(self, *args, **kwargs):
raise raised()
def close(self, *args, **kwargs):
return
return FakeWsConnection()
@pytest.mark.parametrize('handler', ['Websockets'], indirect=True)
class TestWebsocketsRequestHandler:
@pytest.mark.parametrize('raised,expected', [
# https://websockets.readthedocs.io/en/stable/reference/exceptions.html
(lambda: websockets.exceptions.InvalidURI(msg='test', uri='test://'), RequestError),
# Requires a response object. Should be covered by HTTP error tests.
# (lambda: websockets.exceptions.InvalidStatus(), TransportError),
(lambda: websockets.exceptions.InvalidHandshake(), TransportError),
# These are subclasses of InvalidHandshake
(lambda: websockets.exceptions.InvalidHeader(name='test'), TransportError),
(lambda: websockets.exceptions.NegotiationError(), TransportError),
# Catch-all
(lambda: websockets.exceptions.WebSocketException(), TransportError),
(lambda: TimeoutError(), TransportError),
# These may be raised by our create_connection implementation, which should also be caught
(lambda: OSError(), TransportError),
(lambda: ssl.SSLError(), SSLError),
(lambda: ssl.SSLCertVerificationError(), CertificateVerifyError),
(lambda: socks.ProxyError(), ProxyError),
])
def test_request_error_mapping(self, handler, monkeypatch, raised, expected):
import websockets.sync.client
import yt_dlp.networking._websockets
with handler() as rh:
def fake_connect(*args, **kwargs):
raise raised()
monkeypatch.setattr(yt_dlp.networking._websockets, 'create_connection', lambda *args, **kwargs: None)
monkeypatch.setattr(websockets.sync.client, 'connect', fake_connect)
with pytest.raises(expected) as exc_info:
rh.send(Request('ws://fake-url'))
assert exc_info.type is expected
@pytest.mark.parametrize('raised,expected,match', [
# https://websockets.readthedocs.io/en/stable/reference/sync/client.html#websockets.sync.client.ClientConnection.send
(lambda: websockets.exceptions.ConnectionClosed(None, None), TransportError, None),
(lambda: RuntimeError(), TransportError, None),
(lambda: TimeoutError(), TransportError, None),
(lambda: TypeError(), RequestError, None),
(lambda: socks.ProxyError(), ProxyError, None),
# Catch-all
(lambda: websockets.exceptions.WebSocketException(), TransportError, None),
])
def test_ws_send_error_mapping(self, handler, monkeypatch, raised, expected, match):
from yt_dlp.networking._websockets import WebsocketsResponseAdapter
ws = WebsocketsResponseAdapter(create_fake_ws_connection(raised), url='ws://fake-url')
with pytest.raises(expected, match=match) as exc_info:
ws.send('test')
assert exc_info.type is expected
@pytest.mark.parametrize('raised,expected,match', [
# https://websockets.readthedocs.io/en/stable/reference/sync/client.html#websockets.sync.client.ClientConnection.recv
(lambda: websockets.exceptions.ConnectionClosed(None, None), TransportError, None),
(lambda: RuntimeError(), TransportError, None),
(lambda: TimeoutError(), TransportError, None),
(lambda: socks.ProxyError(), ProxyError, None),
# Catch-all
(lambda: websockets.exceptions.WebSocketException(), TransportError, None),
])
def test_ws_recv_error_mapping(self, handler, monkeypatch, raised, expected, match):
from yt_dlp.networking._websockets import WebsocketsResponseAdapter
ws = WebsocketsResponseAdapter(create_fake_ws_connection(raised), url='ws://fake-url')
with pytest.raises(expected, match=match) as exc_info:
ws.recv()
assert exc_info.type is expected
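Taken together, a rough usage sketch of the behavior this new test file covers (the URL is a placeholder echo server and the websockets dependency must be installed): WebSocket connections share the Request/urlopen path with HTTP:

# Usage sketch: opening a WebSocket through yt-dlp's request director.
from yt_dlp import YoutubeDL
from yt_dlp.networking import Request

with YoutubeDL() as ydl:
    ws = ydl.urlopen(Request('ws://127.0.0.1:8765', extensions={'timeout': 5}))  # placeholder URL
    ws.send('hello')
    print(ws.recv())
    ws.close()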


@@ -1 +1 @@
-@py -bb -Werror -Xdev "%~dp0yt_dlp\__main__.py" %*
+@py -Werror -Xdev "%~dp0yt_dlp\__main__.py" %*


@@ -1,2 +1,2 @@
#!/usr/bin/env sh
-exec "${PYTHON:-python3}" -bb -Werror -Xdev "$(dirname "$(realpath "$0")")/yt_dlp/__main__.py" "$@"
+exec "${PYTHON:-python3}" -Werror -Xdev "$(dirname "$(realpath "$0")")/yt_dlp/__main__.py" "$@"


@@ -60,7 +60,13 @@ from .postprocessor import (
get_postprocessor,
)
from .postprocessor.ffmpeg import resolve_mapping as resolve_recode_mapping
-from .update import REPOSITORY, _get_system_deprecation, _make_label, current_git_head, detect_variant
+from .update import (
+REPOSITORY,
+_get_system_deprecation,
+_make_label,
+current_git_head,
+detect_variant,
+)
from .utils import (
DEFAULT_OUTTMPL,
IDENTITY,
@@ -625,13 +631,16 @@ class YoutubeDL:
'Overwriting params from "color" with "no_color"')
self.params['color'] = 'no_color'
-term_allow_color = os.environ.get('TERM', '').lower() != 'dumb'
+term_allow_color = os.getenv('TERM', '').lower() != 'dumb'
+no_color = bool(os.getenv('NO_COLOR'))
def process_color_policy(stream):
stream_name = {sys.stdout: 'stdout', sys.stderr: 'stderr'}[stream]
policy = traverse_obj(self.params, ('color', (stream_name, None), {str}), get_all=False)
if policy in ('auto', None):
-return term_allow_color and supports_terminal_sequences(stream)
+if term_allow_color and supports_terminal_sequences(stream):
+return 'no_color' if no_color else True
+return False
assert policy in ('always', 'never', 'no_color'), policy
return {'always': True, 'never': False}.get(policy, policy)
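In effect, the 'auto' policy now honors the NO_COLOR convention: terminal sequences stay available on capable, non-dumb terminals, but coloring is downgraded when NO_COLOR is set. A condensed standalone sketch of that decision (a simplification, not the method itself):

# Simplified sketch of the 'auto' color decision after this change.
import os

def resolve_auto_color(stream_supports_sequences):
    # dumb terminals and incapable streams never get terminal sequences
    if os.getenv('TERM', '').lower() == 'dumb' or not stream_supports_sequences:
        return False
    # NO_COLOR set: keep sequences but drop coloring
    return 'no_color' if os.getenv('NO_COLOR') else True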
@@ -1176,6 +1185,7 @@ class YoutubeDL:
MATH_FUNCTIONS = {
'+': float.__add__,
'-': float.__sub__,
+'*': float.__mul__,
}
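A usage sketch of the new '*' operator in output-template math (the field name is illustrative; evaluate_outtmpl is the helper the test suite uses for template evaluation, and the exact rendering is an assumption under this change):

from yt_dlp import YoutubeDL

ydl = YoutubeDL()
# seconds -> milliseconds via multiplication inside the template
print(ydl.evaluate_outtmpl('%(duration*1000)d', {'duration': 1.5}))  # expected: 1500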
# Field is of the form key1.key2...
# where keys (except first) can be string, int, slice or "{field, ...}"
@@ -1197,6 +1207,15 @@ class YoutubeDL:
(?:\|(?P<default>.*?))?
)$''')
+def _from_user_input(field):
+if field == ':':
+return ...
+elif ':' in field:
+return slice(*map(int_or_none, field.split(':')))
+elif int_or_none(field) is not None:
+return int(field)
+return field
def _traverse_infodict(fields):
fields = [f for x in re.split(r'\.({.+?})\.?', fields)
for f in ([x] if x.startswith('{') else x.split('.'))]
@@ -1206,11 +1225,12 @@ class YoutubeDL:
for i, f in enumerate(fields):
if not f.startswith('{'):
+fields[i] = _from_user_input(f)
continue
assert f.endswith('}'), f'No closing brace for {f} in {fields}'
-fields[i] = {k: k.split('.') for k in f[1:-1].split(',')}
+fields[i] = {k: list(map(_from_user_input, k.split('.'))) for k in f[1:-1].split(',')}
-return traverse_obj(info_dict, fields, is_user_input=True, traverse_string=True)
+return traverse_obj(info_dict, fields, traverse_string=True)
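Since is_user_input was removed from traverse_obj (see the deleted tests earlier), the template layer now converts user field strings to traversal keys itself. A sketch of the conversions with example results:

# Sketch: the string-to-path conversion performed by the helper added above.
from yt_dlp.utils import int_or_none

def _from_user_input(field):
    if field == ':':
        return ...           # ':' selects all elements
    elif ':' in field:
        return slice(*map(int_or_none, field.split(':')))  # 'a:b:c' -> slice
    elif int_or_none(field) is not None:
        return int(field)    # digit strings become integer indices
    return field             # everything else stays a dict key

assert _from_user_input('3') == 3
assert _from_user_input('3:') == slice(3, None)
assert _from_user_input(':') is ...
assert _from_user_input('title') == 'title'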
def get_value(mdict):
# Object traversal
@@ -2451,9 +2471,16 @@ class YoutubeDL:
return selector_function(ctx_copy)
return final_selector
-stream = io.BytesIO(format_spec.encode())
+# HACK: Python 3.12 changed the underlying parser, rendering '7_a' invalid
+# Prefix numbers with random letters to avoid it being classified as a number
+# See: https://github.com/yt-dlp/yt-dlp/pulls/8797
+# TODO: Implement parser not reliant on tokenize.tokenize
+prefix = ''.join(random.choices(string.ascii_letters, k=32))
+stream = io.BytesIO(re.sub(r'\d[_\d]*', rf'{prefix}\g<0>', format_spec).encode())
try:
-tokens = list(_remove_unused_ops(tokenize.tokenize(stream.readline)))
+tokens = list(_remove_unused_ops(
+token._replace(string=token.string.replace(prefix, ''))
+for token in tokenize.tokenize(stream.readline)))
except tokenize.TokenError:
raise syntax_error('Missing closing/opening brackets or parenthesis', (0, len(format_spec)))
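To see why the prefix trick is needed: on CPython 3.12 the tokenizer rejects tokens like '7_a' as invalid numeric literals, so digits get a random letter prefix that is stripped back out of every token afterwards. A demonstration sketch under that assumption:

# Demonstration sketch of the workaround, standalone.
import io
import random
import re
import string
import tokenize

spec = '248+251/140'  # an illustrative format spec containing bare numbers
prefix = ''.join(random.choices(string.ascii_letters, k=32))
safe = re.sub(r'\d[_\d]*', rf'{prefix}\g<0>', spec)  # digits now parse as names
tokens = [t.string.replace(prefix, '')
          for t in tokenize.tokenize(io.BytesIO(safe.encode()).readline)]
print(tokens)  # token strings with the prefix stripped back out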
@@ -2586,6 +2613,9 @@ class YoutubeDL:
upload_date = datetime.datetime.fromtimestamp(info_dict[ts_key], datetime.timezone.utc)
info_dict[date_key] = upload_date.strftime('%Y%m%d')
+if not info_dict.get('release_year'):
+info_dict['release_year'] = traverse_obj(info_dict, ('release_date', {lambda x: int(x[:4])}))
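The fallback above derives release_year from a YYYYMMDD release_date via traverse_obj's function-in-set transformation (the same feature covered by the test_utils changes). A small usage sketch:

from yt_dlp.utils import traverse_obj

info = {'release_date': '20231230'}
if not info.get('release_year'):
    # the set wraps a transformation applied to the traversed value
    info['release_year'] = traverse_obj(info, ('release_date', {lambda x: int(x[:4])}))
assert info['release_year'] == 2023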
live_keys = ('is_live', 'was_live')
live_status = info_dict.get('live_status')
if live_status is None:
@@ -4052,6 +4082,7 @@ class YoutubeDL:
return self._request_director.send(req)
except NoSupportingHandlers as e:
for ue in e.unsupported_errors:
+# FIXME: This depends on the order of errors.
if not (ue.handler and ue.msg):
continue
if ue.handler.RH_KEY == 'Urllib' and 'unsupported url scheme: "file"' in ue.msg.lower():
@@ -4061,6 +4092,15 @@ class YoutubeDL:
if 'unsupported proxy type: "https"' in ue.msg.lower():
raise RequestError(
'To use an HTTPS proxy for this request, one of the following dependencies needs to be installed: requests')
+elif (
+re.match(r'unsupported url scheme: "wss?"', ue.msg.lower())
+and 'websockets' not in self._request_director.handlers
+):
+raise RequestError(
+'This request requires WebSocket support. '
+'Ensure one of the following dependencies are installed: websockets',
+cause=ue) from ue
raise
except SSLError as e:
if 'UNSAFE_LEGACY_RENEGOTIATION_DISABLED' in str(e):


@@ -1,8 +1,8 @@
-try:
-import contextvars  # noqa: F401
-except Exception:
-raise Exception(
-f'You are using an unsupported version of Python. Only Python versions 3.7 and above are supported by yt-dlp')  # noqa: F541
+import sys
+if sys.version_info < (3, 8):
+raise ImportError(
+f'You are using an unsupported version of Python. Only Python versions 3.8 and above are supported by yt-dlp')  # noqa: F541
__license__ = 'Public Domain'
@@ -12,7 +12,6 @@ import itertools
import optparse
import os
import re
-import sys
import traceback
from .compat import compat_shlex_quote
@@ -74,14 +73,16 @@ def _exit(status=0, *args):
def get_urls(urls, batchfile, verbose):
-# Batch file verification
+"""
+@param verbose  -1: quiet, 0: normal, 1: verbose
+"""
batch_urls = []
if batchfile is not None:
try:
batch_urls = read_batch_urls(
-read_stdin('URLs') if batchfile == '-'
+read_stdin(None if verbose == -1 else 'URLs') if batchfile == '-'
else open(expand_path(batchfile), encoding='utf-8', errors='ignore'))
-if verbose:
+if verbose == 1:
write_string('[debug] Batch file urls: ' + repr(batch_urls) + '\n')
except OSError:
_exit(f'ERROR: batch file {batchfile} could not be read')
@@ -722,7 +723,7 @@ ParsedOptions = collections.namedtuple('ParsedOptions', ('parser', 'options', 'u
def parse_options(argv=None):
"""@returns ParsedOptions(parser, opts, urls, ydl_opts)"""
parser, opts, urls = parseOpts(argv)
-urls = get_urls(urls, opts.batchfile, opts.verbose)
+urls = get_urls(urls, opts.batchfile, -1 if opts.quiet and not opts.verbose else opts.verbose)
set_compat_opts(opts)
try:


@@ -10,17 +10,3 @@ try:
cache  # >= 3.9
except NameError:
cache = lru_cache(maxsize=None)
-try:
-cached_property  # >= 3.8
-except NameError:
-class cached_property:
-def __init__(self, func):
-update_wrapper(self, func)
-self.func = func
-def __get__(self, instance, _):
-if instance is None:
-return self
-setattr(instance, self.func.__name__, self.func(instance))
-return getattr(instance, self.func.__name__)


@@ -6,7 +6,7 @@ from . import get_suitable_downloader
from .common import FileDownloader
from .external import FFmpegFD
from ..networking import Request
-from ..utils import DownloadError, WebSocketsWrapper, str_or_none, try_get
+from ..utils import DownloadError, str_or_none, try_get
class NiconicoDmcFD(FileDownloader):
@@ -64,7 +64,6 @@ class NiconicoLiveFD(FileDownloader):
ws_url = info_dict['url']
ws_extractor = info_dict['ws']
ws_origin_host = info_dict['origin']
-cookies = info_dict.get('cookies')
live_quality = info_dict.get('live_quality', 'high')
live_latency = info_dict.get('live_latency', 'high')
dl = FFmpegFD(self.ydl, self.params or {})
@@ -76,12 +75,7 @@ class NiconicoLiveFD(FileDownloader):
def communicate_ws(reconnect):
if reconnect:
-ws = WebSocketsWrapper(ws_url, {
-'Cookies': str_or_none(cookies) or '',
-'Origin': f'https://{ws_origin_host}',
-'Accept': '*/*',
-'User-Agent': self.params['http_headers']['User-Agent'],
-})
+ws = self.ydl.urlopen(Request(ws_url, headers={'Origin': f'https://{ws_origin_host}'}))
if self.ydl.params.get('verbose', False):
self.to_screen('[debug] Sending startWatching request')
ws.send(json.dumps({


@@ -77,16 +77,23 @@ from .agora import (
WyborczaPodcastIE,
WyborczaVideoIE,
)
-from .airmozilla import AirMozillaIE
from .airtv import AirTVIE
from .aitube import AitubeKZVideoIE
from .aljazeera import AlJazeeraIE
+from .allstar import (
+AllstarIE,
+AllstarProfileIE,
+)
from .alphaporno import AlphaPornoIE
-from .amara import AmaraIE
+from .altcensored import (
+AltCensoredIE,
+AltCensoredChannelIE,
+)
from .alura import (
AluraIE,
AluraCourseIE
)
+from .amara import AmaraIE
from .amcnetworks import AMCNetworksIE
from .amazon import (
AmazonStoreIE,
@@ -127,8 +134,8 @@ from .arcpublishing import ArcPublishingIE
from .arkena import ArkenaIE from .arkena import ArkenaIE
from .ard import ( from .ard import (
ARDBetaMediathekIE, ARDBetaMediathekIE,
ARDMediathekCollectionIE,
ARDIE, ARDIE,
ARDMediathekIE,
) )
from .arte import ( from .arte import (
ArteTVIE, ArteTVIE,
@@ -139,7 +146,6 @@ from .arte import (
from .arnes import ArnesIE from .arnes import ArnesIE
from .atresplayer import AtresPlayerIE from .atresplayer import AtresPlayerIE
from .atscaleconf import AtScaleConfEventIE from .atscaleconf import AtScaleConfEventIE
from .atttechchannel import ATTTechChannelIE
from .atvat import ATVAtIE from .atvat import ATVAtIE
from .audimedia import AudiMediaIE from .audimedia import AudiMediaIE
from .audioboom import AudioBoomIE from .audioboom import AudioBoomIE
@@ -212,6 +218,8 @@ from .bilibili import (
BiliBiliBangumiIE, BiliBiliBangumiIE,
BiliBiliBangumiSeasonIE, BiliBiliBangumiSeasonIE,
BiliBiliBangumiMediaIE, BiliBiliBangumiMediaIE,
BilibiliCheeseIE,
BilibiliCheeseSeasonIE,
BiliBiliSearchIE, BiliBiliSearchIE,
BilibiliCategoryIE, BilibiliCategoryIE,
BilibiliAudioIE, BilibiliAudioIE,
@@ -233,11 +241,6 @@ from .bitchute import (
BitChuteIE, BitChuteIE,
BitChuteChannelIE, BitChuteChannelIE,
) )
from .bitwave import (
BitwaveReplayIE,
BitwaveStreamIE,
)
from .biqle import BIQLEIE
from .blackboardcollaborate import BlackboardCollaborateIE from .blackboardcollaborate import BlackboardCollaborateIE
from .bleacherreport import ( from .bleacherreport import (
BleacherReportIE, BleacherReportIE,
@@ -252,10 +255,7 @@ from .bostonglobe import BostonGlobeIE
from .box import BoxIE from .box import BoxIE
from .boxcast import BoxCastVideoIE from .boxcast import BoxCastVideoIE
from .bpb import BpbIE from .bpb import BpbIE
from .br import ( from .br import BRIE
BRIE,
BRMediathekIE,
)
from .bravotv import BravoTVIE from .bravotv import BravoTVIE
from .brainpop import ( from .brainpop import (
BrainPOPIE, BrainPOPIE,
@@ -265,7 +265,6 @@ from .brainpop import (
BrainPOPFrIE, BrainPOPFrIE,
BrainPOPIlIE, BrainPOPIlIE,
) )
from .breakcom import BreakIE
from .breitbart import BreitBartIE from .breitbart import BreitBartIE
from .brightcove import ( from .brightcove import (
BrightcoveLegacyIE, BrightcoveLegacyIE,
@@ -277,6 +276,7 @@ from .brilliantpala import (
) )
from .businessinsider import BusinessInsiderIE from .businessinsider import BusinessInsiderIE
from .bundesliga import BundesligaIE from .bundesliga import BundesligaIE
from .bundestag import BundestagIE
from .buzzfeed import BuzzFeedIE from .buzzfeed import BuzzFeedIE
from .byutv import BYUtvIE from .byutv import BYUtvIE
from .c56 import C56IE from .c56 import C56IE
@@ -295,16 +295,11 @@ from .camfm import (
from .cammodels import CamModelsIE from .cammodels import CamModelsIE
from .camsoda import CamsodaIE from .camsoda import CamsodaIE
from .camtasia import CamtasiaEmbedIE from .camtasia import CamtasiaEmbedIE
from .camwithher import CamWithHerIE
from .canal1 import Canal1IE from .canal1 import Canal1IE
from .canalalpha import CanalAlphaIE from .canalalpha import CanalAlphaIE
from .canalplus import CanalplusIE from .canalplus import CanalplusIE
from .canalc2 import Canalc2IE from .canalc2 import Canalc2IE
from .caracoltv import CaracolTvPlayIE from .caracoltv import CaracolTvPlayIE
from .carambatv import (
CarambaTVIE,
CarambaTVPageIE,
)
from .cartoonnetwork import CartoonNetworkIE from .cartoonnetwork import CartoonNetworkIE
from .cbc import ( from .cbc import (
CBCIE, CBCIE,
@@ -343,7 +338,6 @@ from .cda import CDAIE
from .cellebrite import CellebriteIE from .cellebrite import CellebriteIE
from .ceskatelevize import CeskaTelevizeIE from .ceskatelevize import CeskaTelevizeIE
from .cgtn import CGTNIE from .cgtn import CGTNIE
from .channel9 import Channel9IE
from .charlierose import CharlieRoseIE from .charlierose import CharlieRoseIE
from .chaturbate import ChaturbateIE from .chaturbate import ChaturbateIE
from .chilloutzone import ChilloutzoneIE from .chilloutzone import ChilloutzoneIE
@@ -351,11 +345,6 @@ from .chingari import (
ChingariIE, ChingariIE,
ChingariUserIE, ChingariUserIE,
) )
from .chirbit import (
ChirbitIE,
ChirbitProfileIE,
)
from .cinchcast import CinchcastIE
from .cinemax import CinemaxIE from .cinemax import CinemaxIE
from .cinetecamilano import CinetecaMilanoIE from .cinetecamilano import CinetecaMilanoIE
from .cineverse import ( from .cineverse import (
@@ -372,10 +361,8 @@ from .clipchamp import ClipchampIE
from .cliphunter import CliphunterIE from .cliphunter import CliphunterIE
from .clippit import ClippitIE from .clippit import ClippitIE
from .cliprs import ClipRsIE from .cliprs import ClipRsIE
from .clipsyndicate import ClipsyndicateIE
from .closertotruth import CloserToTruthIE from .closertotruth import CloserToTruthIE
from .cloudflarestream import CloudflareStreamIE from .cloudflarestream import CloudflareStreamIE
from .cloudy import CloudyIE
from .clubic import ClubicIE from .clubic import ClubicIE
from .clyp import ClypIE from .clyp import ClypIE
from .cmt import CMTIE from .cmt import CMTIE
@@ -442,7 +429,6 @@ from .dacast import (
DacastVODIE, DacastVODIE,
DacastPlaylistIE, DacastPlaylistIE,
) )
from .daftsex import DaftsexIE
from .dailymail import DailyMailIE from .dailymail import DailyMailIE
from .dailymotion import ( from .dailymotion import (
DailymotionIE, DailymotionIE,
@@ -479,7 +465,6 @@ from .dlf import (
from .dfb import DFBIE from .dfb import DFBIE
from .dhm import DHMIE from .dhm import DHMIE
from .digg import DiggIE from .digg import DiggIE
from .dotsub import DotsubIE
from .douyutv import ( from .douyutv import (
DouyuShowIE, DouyuShowIE,
DouyuTVIE, DouyuTVIE,
@@ -526,7 +511,6 @@ from .duboku import (
DubokuPlaylistIE DubokuPlaylistIE
) )
from .dumpert import DumpertIE from .dumpert import DumpertIE
from .defense import DefenseGouvFrIE
from .deuxm import ( from .deuxm import (
DeuxMIE, DeuxMIE,
DeuxMNewsIE DeuxMNewsIE
@@ -541,6 +525,7 @@ from .dropout import (
DropoutSeasonIE, DropoutSeasonIE,
DropoutIE DropoutIE
) )
from .duoplay import DuoplayIE
from .dw import ( from .dw import (
DWIE, DWIE,
DWArticleIE, DWArticleIE,
@@ -548,30 +533,22 @@ from .dw import (
from .eagleplatform import EaglePlatformIE, ClipYouEmbedIE from .eagleplatform import EaglePlatformIE, ClipYouEmbedIE
from .ebaumsworld import EbaumsWorldIE from .ebaumsworld import EbaumsWorldIE
from .ebay import EbayIE from .ebay import EbayIE
from .echomsk import EchoMskIE
from .egghead import ( from .egghead import (
EggheadCourseIE, EggheadCourseIE,
EggheadLessonIE, EggheadLessonIE,
) )
from .ehow import EHowIE
from .eighttracks import EightTracksIE from .eighttracks import EightTracksIE
from .einthusan import EinthusanIE from .einthusan import EinthusanIE
from .eitb import EitbIE from .eitb import EitbIE
from .elevensports import ElevenSportsIE
from .ellentube import (
EllenTubeIE,
EllenTubeVideoIE,
EllenTubePlaylistIE,
)
from .elonet import ElonetIE from .elonet import ElonetIE
from .elpais import ElPaisIE from .elpais import ElPaisIE
from .eltrecetv import ElTreceTVIE from .eltrecetv import ElTreceTVIE
from .embedly import EmbedlyIE from .embedly import EmbedlyIE
from .engadget import EngadgetIE
from .epicon import ( from .epicon import (
EpiconIE, EpiconIE,
EpiconSeriesIE, EpiconSeriesIE,
) )
from .epidemicsound import EpidemicSoundIE
from .eplus import EplusIbIE from .eplus import EplusIbIE
from .epoch import EpochIE from .epoch import EpochIE
from .eporner import EpornerIE from .eporner import EpornerIE
@@ -585,7 +562,6 @@ from .ertgr import (
ERTFlixIE, ERTFlixIE,
ERTWebtvEmbedIE, ERTWebtvEmbedIE,
) )
from .escapist import EscapistIE
from .espn import ( from .espn import (
ESPNIE, ESPNIE,
WatchESPNIE, WatchESPNIE,
@@ -593,15 +569,12 @@ from .espn import (
FiveThirtyEightIE, FiveThirtyEightIE,
ESPNCricInfoIE, ESPNCricInfoIE,
) )
from .esri import EsriVideoIE
from .ettutv import EttuTvIE from .ettutv import EttuTvIE
from .europa import EuropaIE, EuroParlWebstreamIE from .europa import EuropaIE, EuroParlWebstreamIE
from .europeantour import EuropeanTourIE from .europeantour import EuropeanTourIE
from .eurosport import EurosportIE from .eurosport import EurosportIE
from .euscreen import EUScreenIE from .euscreen import EUScreenIE
from .expotv import ExpoTVIE
from .expressen import ExpressenIE from .expressen import ExpressenIE
from .extremetube import ExtremeTubeIE
from .eyedotv import EyedoTVIE from .eyedotv import EyedoTVIE
from .facebook import ( from .facebook import (
FacebookIE, FacebookIE,
@@ -631,6 +604,10 @@ from .filmweb import FilmwebIE
from .firsttv import FirstTVIE from .firsttv import FirstTVIE
from .fivetv import FiveTVIE from .fivetv import FiveTVIE
from .flickr import FlickrIE from .flickr import FlickrIE
from .floatplane import (
FloatplaneIE,
FloatplaneChannelIE,
)
from .folketinget import FolketingetIE from .folketinget import FolketingetIE
from .footyroom import FootyRoomIE from .footyroom import FootyRoomIE
from .formula1 import Formula1IE from .formula1 import Formula1IE
@@ -640,16 +617,11 @@ from .fourtube import (
PornerBrosIE, PornerBrosIE,
FuxIE, FuxIE,
) )
from .fourzerostudio import (
FourZeroStudioArchiveIE,
FourZeroStudioClipIE,
)
from .fox import FOXIE from .fox import FOXIE
from .fox9 import ( from .fox9 import (
FOX9IE, FOX9IE,
FOX9NewsIE, FOX9NewsIE,
) )
from .foxgay import FoxgayIE
from .foxnews import ( from .foxnews import (
FoxNewsIE, FoxNewsIE,
FoxNewsArticleIE, FoxNewsArticleIE,
@@ -682,7 +654,6 @@ from .funimation import (
) )
from .funk import FunkIE from .funk import FunkIE
from .funker530 import Funker530IE from .funker530 import Funker530IE
from .fusion import FusionIE
from .fuyintv import FuyinTVIE from .fuyintv import FuyinTVIE
from .gab import ( from .gab import (
GabTVIE, GabTVIE,
@@ -713,7 +684,6 @@ from .gettr import (
GettrIE, GettrIE,
GettrStreamingIE, GettrStreamingIE,
) )
from .gfycat import GfycatIE
from .giantbomb import GiantBombIE from .giantbomb import GiantBombIE
from .giga import GigaIE from .giga import GigaIE
from .glide import GlideIE from .glide import GlideIE
@@ -759,12 +729,10 @@ from .hbo import HBOIE
from .hearthisat import HearThisAtIE from .hearthisat import HearThisAtIE
from .heise import HeiseIE from .heise import HeiseIE
from .hellporno import HellPornoIE from .hellporno import HellPornoIE
from .helsinki import HelsinkiIE
from .hgtv import HGTVComShowIE from .hgtv import HGTVComShowIE
from .hketv import HKETVIE from .hketv import HKETVIE
from .hidive import HiDiveIE from .hidive import HiDiveIE
from .historicfilms import HistoricFilmsIE from .historicfilms import HistoricFilmsIE
from .hitbox import HitboxIE, HitboxLiveIE
from .hitrecord import HitRecordIE from .hitrecord import HitRecordIE
from .hollywoodreporter import ( from .hollywoodreporter import (
HollywoodReporterIE, HollywoodReporterIE,
@@ -779,8 +747,6 @@ from .hotstar import (
HotStarSeasonIE, HotStarSeasonIE,
HotStarSeriesIE, HotStarSeriesIE,
) )
from .howcast import HowcastIE
from .howstuffworks import HowStuffWorksIE
from .hrefli import HrefLiRedirectIE from .hrefli import HrefLiRedirectIE
from .hrfensehen import HRFernsehenIE from .hrfensehen import HRFernsehenIE
from .hrti import ( from .hrti import (
@@ -900,6 +866,7 @@ from .jiosaavn import (
) )
from .jove import JoveIE from .jove import JoveIE
from .joj import JojIE from .joj import JojIE
from .joqrag import JoqrAgIE
from .jstream import JStreamIE from .jstream import JStreamIE
from .jtbc import ( from .jtbc import (
JTBCIE, JTBCIE,
@@ -912,7 +879,6 @@ from .kanal2 import Kanal2IE
from .kankanews import KankaNewsIE from .kankanews import KankaNewsIE
from .karaoketv import KaraoketvIE from .karaoketv import KaraoketvIE
from .karrierevideos import KarriereVideosIE from .karrierevideos import KarriereVideosIE
from .keezmovies import KeezMoviesIE
from .kelbyone import KelbyOneIE from .kelbyone import KelbyOneIE
from .khanacademy import ( from .khanacademy import (
KhanAcademyIE, KhanAcademyIE,
@@ -947,12 +913,6 @@ from .la7 import (
LA7PodcastEpisodeIE, LA7PodcastEpisodeIE,
LA7PodcastIE, LA7PodcastIE,
) )
from .laola1tv import (
Laola1TvEmbedIE,
Laola1TvIE,
EHFTVIE,
ITTFIE,
)
from .lastfm import ( from .lastfm import (
LastFMIE, LastFMIE,
LastFMPlaylistIE, LastFMPlaylistIE,
@@ -1007,7 +967,6 @@ from .linkedin import (
LinkedInLearningIE, LinkedInLearningIE,
LinkedInLearningCourseIE, LinkedInLearningCourseIE,
) )
from .linuxacademy import LinuxAcademyIE
from .liputan6 import Liputan6IE from .liputan6 import Liputan6IE
from .listennotes import ListenNotesIE from .listennotes import ListenNotesIE
from .litv import LiTVIE from .litv import LiTVIE
@@ -1035,7 +994,7 @@ from .lynda import (
LyndaIE, LyndaIE,
LyndaCourseIE LyndaCourseIE
) )
from .m6 import M6IE from .maariv import MaarivIE
from .magellantv import MagellanTVIE from .magellantv import MagellanTVIE
from .magentamusik360 import MagentaMusik360IE from .magentamusik360 import MagentaMusik360IE
from .mailru import ( from .mailru import (
@@ -1086,10 +1045,7 @@ from .medici import MediciIE
from .megaphone import MegaphoneIE from .megaphone import MegaphoneIE
from .meipai import MeipaiIE from .meipai import MeipaiIE
from .melonvod import MelonVODIE from .melonvod import MelonVODIE
from .meta import METAIE
from .metacafe import MetacafeIE
from .metacritic import MetacriticIE from .metacritic import MetacriticIE
from .mgoon import MgoonIE
from .mgtv import MGTVIE from .mgtv import MGTVIE
from .miaopai import MiaoPaiIE from .miaopai import MiaoPaiIE
from .microsoftstream import MicrosoftStreamIE from .microsoftstream import MicrosoftStreamIE
@@ -1111,7 +1067,6 @@ from .minds import (
) )
from .ministrygrid import MinistryGridIE from .ministrygrid import MinistryGridIE
from .minoto import MinotoIE from .minoto import MinotoIE
from .miomio import MioMioIE
from .mirrativ import ( from .mirrativ import (
MirrativIE, MirrativIE,
MirrativUserIE, MirrativUserIE,
@@ -1135,13 +1090,7 @@ from .mlb import (
MLBArticleIE, MLBArticleIE,
) )
from .mlssoccer import MLSSoccerIE from .mlssoccer import MLSSoccerIE
from .mnet import MnetIE
from .mocha import MochaVideoIE from .mocha import MochaVideoIE
from .moevideo import MoeVideoIE
from .mofosex import (
MofosexIE,
MofosexEmbedIE,
)
from .mojvideo import MojvideoIE from .mojvideo import MojvideoIE
from .monstercat import MonstercatIE from .monstercat import MonstercatIE
from .morningstar import MorningstarIE from .morningstar import MorningstarIE
@@ -1151,7 +1100,6 @@ from .motherless import (
MotherlessGalleryIE, MotherlessGalleryIE,
) )
from .motorsport import MotorsportIE from .motorsport import MotorsportIE
from .movieclips import MovieClipsIE
from .moviepilot import MoviepilotIE from .moviepilot import MoviepilotIE
from .moview import MoviewPlayIE from .moview import MoviewPlayIE
from .moviezine import MoviezineIE from .moviezine import MoviezineIE
@@ -1176,18 +1124,12 @@ from .musicdex import (
MusicdexArtistIE, MusicdexArtistIE,
MusicdexPlaylistIE, MusicdexPlaylistIE,
) )
from .mwave import MwaveIE, MwaveMeetGreetIE
from .mxplayer import ( from .mxplayer import (
MxplayerIE, MxplayerIE,
MxplayerShowIE, MxplayerShowIE,
) )
from .mychannels import MyChannelsIE
from .myspace import MySpaceIE, MySpaceAlbumIE from .myspace import MySpaceIE, MySpaceAlbumIE
from .myspass import MySpassIE from .myspass import MySpassIE
from .myvi import (
MyviIE,
MyviEmbedIE,
)
from .myvideoge import MyVideoGeIE from .myvideoge import MyVideoGeIE
from .myvidster import MyVidsterIE from .myvidster import MyVidsterIE
from .mzaalo import MzaaloIE from .mzaalo import MzaaloIE
@@ -1236,6 +1178,7 @@ from .ndr import (
from .ndtv import NDTVIE from .ndtv import NDTVIE
from .nebula import ( from .nebula import (
NebulaIE, NebulaIE,
NebulaClassIE,
NebulaSubscriptionsIE, NebulaSubscriptionsIE,
NebulaChannelIE, NebulaChannelIE,
) )
@@ -1262,7 +1205,6 @@ from .newgrounds import (
NewgroundsUserIE, NewgroundsUserIE,
) )
from .newspicks import NewsPicksIE from .newspicks import NewsPicksIE
from .newstube import NewstubeIE
from .newsy import NewsyIE from .newsy import NewsyIE
from .nextmedia import ( from .nextmedia import (
NextMediaIE, NextMediaIE,
@@ -1297,7 +1239,6 @@ from .nick import (
NickIE, NickIE,
NickBrIE, NickBrIE,
NickDeIE, NickDeIE,
NickNightIE,
NickRuIE, NickRuIE,
) )
from .niconico import ( from .niconico import (
@@ -1330,8 +1271,6 @@ from .noice import NoicePodcastIE
from .nonktube import NonkTubeIE from .nonktube import NonkTubeIE
from .noodlemagazine import NoodleMagazineIE from .noodlemagazine import NoodleMagazineIE
from .noovo import NoovoIE from .noovo import NoovoIE
from .normalboots import NormalbootsIE
from .nosvideo import NosVideoIE
from .nosnl import NOSNLArticleIE from .nosnl import NOSNLArticleIE
from .nova import ( from .nova import (
NovaEmbedIE, NovaEmbedIE,
@@ -1406,10 +1345,6 @@ from .onet import (
OnetPlIE, OnetPlIE,
) )
from .onionstudios import OnionStudiosIE from .onionstudios import OnionStudiosIE
from .ooyala import (
OoyalaIE,
OoyalaExternalIE,
)
from .opencast import ( from .opencast import (
OpencastIE, OpencastIE,
OpencastPlaylistIE, OpencastPlaylistIE,
@@ -1438,7 +1373,6 @@ from .palcomp3 import (
PalcoMP3ArtistIE, PalcoMP3ArtistIE,
PalcoMP3VideoIE, PalcoMP3VideoIE,
) )
from .pandoratv import PandoraTVIE
from .panopto import ( from .panopto import (
PanoptoIE, PanoptoIE,
PanoptoListIE, PanoptoListIE,
@@ -1466,7 +1400,6 @@ from .peloton import (
PelotonIE, PelotonIE,
PelotonLiveIE PelotonLiveIE
) )
from .people import PeopleIE
from .performgroup import PerformGroupIE from .performgroup import PerformGroupIE
from .periscope import ( from .periscope import (
PeriscopeIE, PeriscopeIE,
@@ -1498,13 +1431,10 @@ from .platzi import (
PlatziIE, PlatziIE,
PlatziCourseIE, PlatziCourseIE,
) )
from .playfm import PlayFMIE
from .playplustv import PlayPlusTVIE from .playplustv import PlayPlusTVIE
from .plays import PlaysTVIE
from .playstuff import PlayStuffIE from .playstuff import PlayStuffIE
from .playsuisse import PlaySuisseIE from .playsuisse import PlaySuisseIE
from .playtvak import PlaytvakIE from .playtvak import PlaytvakIE
from .playvid import PlayvidIE
from .playwire import PlaywireIE from .playwire import PlaywireIE
from .plutotv import PlutoTVIE from .plutotv import PlutoTVIE
from .pluralsight import ( from .pluralsight import (
@@ -1536,9 +1466,7 @@ from .popcorntimes import PopcorntimesIE
from .popcorntv import PopcornTVIE from .popcorntv import PopcornTVIE
from .porn91 import Porn91IE from .porn91 import Porn91IE
from .pornbox import PornboxIE from .pornbox import PornboxIE
from .porncom import PornComIE
from .pornflip import PornFlipIE from .pornflip import PornFlipIE
from .pornhd import PornHdIE
from .pornhub import ( from .pornhub import (
PornHubIE, PornHubIE,
PornHubUserIE, PornHubUserIE,
@@ -1549,7 +1477,6 @@ from .pornhub import (
from .pornotube import PornotubeIE from .pornotube import PornotubeIE
from .pornovoisines import PornoVoisinesIE from .pornovoisines import PornoVoisinesIE
from .pornoxo import PornoXOIE from .pornoxo import PornoXOIE
from .pornez import PornezIE
from .puhutv import ( from .puhutv import (
PuhuTVIE, PuhuTVIE,
PuhuTVSerieIE, PuhuTVSerieIE,
@@ -1593,7 +1520,6 @@ from .radiocomercial import (
) )
from .radiode import RadioDeIE from .radiode import RadioDeIE
from .radiojavan import RadioJavanIE from .radiojavan import RadioJavanIE
from .radiobremen import RadioBremenIE
from .radiofrance import ( from .radiofrance import (
FranceCultureIE, FranceCultureIE,
RadioFranceIE, RadioFranceIE,
@@ -1645,7 +1571,6 @@ from .rcti import (
RCTIPlusTVIE, RCTIPlusTVIE,
) )
from .rds import RDSIE from .rds import RDSIE
from .recurbate import RecurbateIE
from .redbee import ParliamentLiveUKIE, RTBFIE from .redbee import ParliamentLiveUKIE, RTBFIE
from .redbulltv import ( from .redbulltv import (
RedBullTVIE, RedBullTVIE,
@@ -1669,7 +1594,7 @@ from .restudy import RestudyIE
from .reuters import ReutersIE from .reuters import ReutersIE
from .reverbnation import ReverbNationIE from .reverbnation import ReverbNationIE
from .rheinmaintv import RheinMainTVIE from .rheinmaintv import RheinMainTVIE
from .rice import RICEIE from .rinsefm import RinseFMIE
from .rmcdecouverte import RMCDecouverteIE from .rmcdecouverte import RMCDecouverteIE
from .rockstargames import RockstarGamesIE from .rockstargames import RockstarGamesIE
from .rokfin import ( from .rokfin import (
@@ -1693,11 +1618,7 @@ from .rtlnl import (
RTLLuLiveIE, RTLLuLiveIE,
RTLLuRadioIE, RTLLuRadioIE,
) )
from .rtl2 import ( from .rtl2 import RTL2IE
RTL2IE,
RTL2YouIE,
RTL2YouSeriesIE,
)
from .rtnews import ( from .rtnews import (
RTNewsIE, RTNewsIE,
RTDocumentryIE, RTDocumentryIE,
@@ -1719,16 +1640,15 @@ from .rtve import (
RTVEInfantilIE, RTVEInfantilIE,
RTVETelevisionIE, RTVETelevisionIE,
) )
from .rtvnh import RTVNHIE
from .rtvs import RTVSIE from .rtvs import RTVSIE
from .rtvslo import RTVSLOIE from .rtvslo import RTVSLOIE
from .ruhd import RUHDIE
from .rule34video import Rule34VideoIE from .rule34video import Rule34VideoIE
from .rumble import ( from .rumble import (
RumbleEmbedIE, RumbleEmbedIE,
RumbleIE, RumbleIE,
RumbleChannelIE, RumbleChannelIE,
) )
from .rudovideo import RudoVideoIE
from .rutube import ( from .rutube import (
RutubeIE, RutubeIE,
RutubeChannelIE, RutubeChannelIE,
@@ -1804,10 +1724,6 @@ from .shahid import (
ShahidIE, ShahidIE,
ShahidShowIE, ShahidShowIE,
) )
from .shared import (
SharedIE,
VivoIE,
)
from .sharevideos import ShareVideosEmbedIE from .sharevideos import ShareVideosEmbedIE
from .sibnet import SibnetEmbedIE from .sibnet import SibnetEmbedIE
from .shemaroome import ShemarooMeIE from .shemaroome import ShemarooMeIE
@@ -1885,7 +1801,6 @@ from .spankbang import (
SpankBangIE, SpankBangIE,
SpankBangPlaylistIE, SpankBangPlaylistIE,
) )
from .spankwire import SpankwireIE
from .spiegel import SpiegelIE from .spiegel import SpiegelIE
from .spike import ( from .spike import (
BellatorIE, BellatorIE,
@@ -1935,7 +1850,6 @@ from .storyfire import (
StoryFireSeriesIE, StoryFireSeriesIE,
) )
from .streamable import StreamableIE from .streamable import StreamableIE
from .streamcloud import StreamcloudIE
from .streamcz import StreamCZIE from .streamcz import StreamCZIE
from .streamff import StreamFFIE from .streamff import StreamFFIE
from .streetvoice import StreetVoiceIE from .streetvoice import StreetVoiceIE
@@ -1955,7 +1869,6 @@ from .svt import (
SVTSeriesIE, SVTSeriesIE,
) )
from .swearnet import SwearnetEpisodeIE from .swearnet import SwearnetEpisodeIE
from .swrmediathek import SWRMediathekIE
from .syvdk import SYVDKIE from .syvdk import SYVDKIE
from .syfy import SyfyIE from .syfy import SyfyIE
from .sztvhu import SztvHuIE from .sztvhu import SztvHuIE
@@ -1982,7 +1895,6 @@ from .teamcoco import (
ConanClassicIE, ConanClassicIE,
) )
from .teamtreehouse import TeamTreeHouseIE from .teamtreehouse import TeamTreeHouseIE
from .techtalks import TechTalksIE
from .ted import ( from .ted import (
TedEmbedIE, TedEmbedIE,
TedPlaylistIE, TedPlaylistIE,
@@ -2024,6 +1936,10 @@ from .tenplay import (
from .testurl import TestURLIE from .testurl import TestURLIE
from .tf1 import TF1IE from .tf1 import TF1IE
from .tfo import TFOIE from .tfo import TFOIE
from .theguardian import (
TheGuardianPodcastIE,
TheGuardianPodcastPlaylistIE,
)
from .theholetv import TheHoleTvIE from .theholetv import TheHoleTvIE
from .theintercept import TheInterceptIE from .theintercept import TheInterceptIE
from .theplatform import ( from .theplatform import (
@@ -2055,7 +1971,6 @@ from .tiktok import (
TikTokLiveIE, TikTokLiveIE,
DouyinIE, DouyinIE,
) )
from .tinypic import TinyPicIE
from .tmz import TMZIE from .tmz import TMZIE
from .tnaflix import ( from .tnaflix import (
TNAFlixNetworkEmbedIE, TNAFlixNetworkEmbedIE,
@@ -2070,10 +1985,6 @@ from .toggle import (
from .toggo import ( from .toggo import (
ToggoIE, ToggoIE,
) )
from .tokentube import (
TokentubeIE,
TokentubeChannelIE
)
from .tonline import TOnlineIE from .tonline import TOnlineIE
from .toongoggles import ToonGogglesIE from .toongoggles import ToonGogglesIE
from .toutv import TouTvIE from .toutv import TouTvIE
@@ -2084,7 +1995,6 @@ from .triller import (
TrillerUserIE, TrillerUserIE,
TrillerShortIE, TrillerShortIE,
) )
from .trilulilu import TriluliluIE
from .trovo import ( from .trovo import (
TrovoIE, TrovoIE,
TrovoVodIE, TrovoVodIE,
@@ -2109,7 +2019,6 @@ from .tunein import (
TuneInPodcastEpisodeIE, TuneInPodcastEpisodeIE,
TuneInShortenerIE, TuneInShortenerIE,
) )
from .tunepk import TunePkIE
from .turbo import TurboIE from .turbo import TurboIE
from .tv2 import ( from .tv2 import (
TV2IE, TV2IE,
@@ -2151,16 +2060,7 @@ from .tvigle import TvigleIE
from .tviplayer import TVIPlayerIE from .tviplayer import TVIPlayerIE
from .tvland import TVLandIE from .tvland import TVLandIE
from .tvn24 import TVN24IE from .tvn24 import TVN24IE
from .tvnet import TVNetIE
from .tvnoe import TVNoeIE from .tvnoe import TVNoeIE
from .tvnow import (
TVNowIE,
TVNowFilmIE,
TVNowNewIE,
TVNowSeasonIE,
TVNowAnnualIE,
TVNowShowIE,
)
from .tvopengr import ( from .tvopengr import (
TVOpenGrWatchIE, TVOpenGrWatchIE,
TVOpenGrEmbedIE, TVOpenGrEmbedIE,
@@ -2178,7 +2078,6 @@ from .tvplay import (
) )
from .tvplayer import TVPlayerIE from .tvplayer import TVPlayerIE
from .tweakers import TweakersIE from .tweakers import TweakersIE
from .twentyfourvideo import TwentyFourVideoIE
from .twentymin import TwentyMinutenIE from .twentymin import TwentyMinutenIE
from .twentythreevideo import TwentyThreeVideoIE from .twentythreevideo import TwentyThreeVideoIE
from .twitcasting import ( from .twitcasting import (
@@ -2227,7 +2126,6 @@ from .drooble import DroobleIE
from .umg import UMGDeIE from .umg import UMGDeIE
from .unistra import UnistraIE from .unistra import UnistraIE
from .unity import UnityIE from .unity import UnityIE
from .unscripted import UnscriptedNewsVideoIE
from .unsupported import KnownDRMIE, KnownPiracyIE from .unsupported import KnownDRMIE, KnownPiracyIE
from .uol import UOLIE from .uol import UOLIE
from .uplynk import ( from .uplynk import (
@@ -2246,7 +2144,6 @@ from .ustudio import (
from .utreon import UtreonIE from .utreon import UtreonIE
from .varzesh3 import Varzesh3IE from .varzesh3 import Varzesh3IE
from .vbox7 import Vbox7IE from .vbox7 import Vbox7IE
from .veehd import VeeHDIE
from .veo import VeoIE from .veo import VeoIE
from .veoh import ( from .veoh import (
VeohIE, VeohIE,
@@ -2268,7 +2165,6 @@ from .vice import (
ViceArticleIE, ViceArticleIE,
ViceShowIE, ViceShowIE,
) )
from .vidbit import VidbitIE
from .viddler import ViddlerIE from .viddler import ViddlerIE
from .videa import VideaIE from .videa import VideaIE
from .videocampus_sachsen import ( from .videocampus_sachsen import (
@@ -2296,6 +2192,7 @@ from .vidio import (
VidioLiveIE VidioLiveIE
) )
from .vidlii import VidLiiIE from .vidlii import VidLiiIE
from .vidly import VidlyIE
from .viewlift import ( from .viewlift import (
ViewLiftIE, ViewLiftIE,
ViewLiftEmbedIE, ViewLiftEmbedIE,
@@ -2318,7 +2215,6 @@ from .vimm import (
VimmIE, VimmIE,
VimmRecordingIE, VimmRecordingIE,
) )
from .vimple import VimpleIE
from .vine import ( from .vine import (
VineIE, VineIE,
VineUserIE, VineUserIE,
@@ -2342,10 +2238,8 @@ from .vk import (
VKPlayLiveIE, VKPlayLiveIE,
) )
from .vocaroo import VocarooIE from .vocaroo import VocarooIE
from .vodlocker import VodlockerIE
from .vodpl import VODPlIE from .vodpl import VODPlIE
from .vodplatform import VODPlatformIE from .vodplatform import VODPlatformIE
from .voicerepublic import VoiceRepublicIE
from .voicy import ( from .voicy import (
VoicyIE, VoicyIE,
VoicyChannelIE, VoicyChannelIE,
@@ -2365,23 +2259,13 @@ from .vrt import (
KetnetIE, KetnetIE,
DagelijkseKostIE, DagelijkseKostIE,
) )
from .vrak import VrakIE
from .vrv import (
VRVIE,
VRVSeriesIE,
)
from .vshare import VShareIE
from .vtm import VTMIE from .vtm import VTMIE
from .medialaan import MedialaanIE from .medialaan import MedialaanIE
from .vuclip import VuClipIE from .vuclip import VuClipIE
from .vupload import VuploadIE
from .vvvvid import ( from .vvvvid import (
VVVVIDIE, VVVVIDIE,
VVVVIDShowIE, VVVVIDShowIE,
) )
from .vyborymos import VyboryMosIE
from .vzaar import VzaarIE
from .wakanim import WakanimIE
from .walla import WallaIE from .walla import WallaIE
from .washingtonpost import ( from .washingtonpost import (
WashingtonPostIE, WashingtonPostIE,
@@ -2393,8 +2277,6 @@ from .wasdtv import (
WASDTVClipIE, WASDTVClipIE,
) )
from .wat import WatIE from .wat import WatIE
from .watchbox import WatchBoxIE
from .watchindianporn import WatchIndianPornIE
from .wdr import ( from .wdr import (
WDRIE, WDRIE,
WDRPageIE, WDRPageIE,
@@ -2428,7 +2310,6 @@ from .wevidi import WeVidiIE
from .weyyak import WeyyakIE from .weyyak import WeyyakIE
from .whyp import WhypIE from .whyp import WhypIE
from .wikimedia import WikimediaIE from .wikimedia import WikimediaIE
from .willow import WillowIE
from .wimbledon import WimbledonIE from .wimbledon import WimbledonIE
from .wimtv import WimTVIE from .wimtv import WimTVIE
from .whowatch import WhoWatchIE from .whowatch import WhoWatchIE
@@ -2462,7 +2343,6 @@ from .wykop import (
WykopPostCommentIE, WykopPostCommentIE,
) )
from .xanimu import XanimuIE from .xanimu import XanimuIE
from .xbef import XBefIE
from .xboxclips import XboxClipsIE from .xboxclips import XboxClipsIE
from .xfileshare import XFileShareIE from .xfileshare import XFileShareIE
from .xhamster import ( from .xhamster import (
@@ -2478,8 +2358,6 @@ from .xinpianchang import XinpianchangIE
from .xminus import XMinusIE from .xminus import XMinusIE
from .xnxx import XNXXIE from .xnxx import XNXXIE
from .xstream import XstreamIE from .xstream import XstreamIE
from .xtube import XTubeUserIE, XTubeIE
from .xuite import XuiteIE
from .xvideos import ( from .xvideos import (
XVideosIE, XVideosIE,
XVideosQuickiesIE XVideosQuickiesIE
@@ -2509,10 +2387,7 @@ from .yappy import (
YappyIE, YappyIE,
YappyProfileIE, YappyProfileIE,
) )
from .yesjapan import YesJapanIE
from .yinyuetai import YinYueTaiIE
from .yle_areena import YleAreenaIE from .yle_areena import YleAreenaIE
from .ynet import YnetIE
from .youjizz import YouJizzIE from .youjizz import YouJizzIE
from .youku import ( from .youku import (
YoukuIE, YoukuIE,
@@ -2588,6 +2463,9 @@ from .zingmp3 import (
ZingMp3ChartMusicVideoIE, ZingMp3ChartMusicVideoIE,
ZingMp3UserIE, ZingMp3UserIE,
ZingMp3HubIE, ZingMp3HubIE,
ZingMp3LiveRadioIE,
ZingMp3PodcastEpisodeIE,
ZingMp3PodcastIE,
) )
from .zoom import ZoomIE from .zoom import ZoomIE
from .zype import ZypeIE from .zype import ZypeIE

diff --git a/yt_dlp/extractor/abematv.py b/yt_dlp/extractor/abematv.py

@@ -211,7 +211,8 @@ class AbemaTVIE(AbemaTVBaseIE):
             'id': '194-25_s2_p1',
             'title': '第1話 「チーズケーキ」 「モーニング再び」',
             'series': '異世界食堂2',
-            'series_number': 2,
+            'season': 'シーズン2',
+            'season_number': 2,
             'episode': '第1話 「チーズケーキ」 「モーニング再び」',
             'episode_number': 1,
         },
@@ -347,12 +348,12 @@ class AbemaTVIE(AbemaTVBaseIE):
                 )?
             ''', r'\1', og_desc)

-        # canonical URL may contain series and episode number
+        # canonical URL may contain season and episode number
         mobj = re.search(r's(\d+)_p(\d+)$', canonical_url)
         if mobj:
             seri = int_or_none(mobj.group(1), default=float('inf'))
             epis = int_or_none(mobj.group(2), default=float('inf'))
-            info['series_number'] = seri if seri < 100 else None
+            info['season_number'] = seri if seri < 100 else None
            # some anime like Detective Conan (though not available in AbemaTV)
            # has more than 1000 episodes (1026 as of 2021/11/15)
             info['episode_number'] = epis if epis < 2000 else None
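
A quick check of that canonical-URL pattern against the test ID above (illustrative snippet):

    import re

    mobj = re.search(r's(\d+)_p(\d+)$', '194-25_s2_p1')
    assert (int(mobj.group(1)), int(mobj.group(2))) == (2, 1)  # season 2, episode 1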
@@ -381,7 +382,7 @@ class AbemaTVIE(AbemaTVBaseIE):
             self.report_warning('This is a premium-only stream')
         info.update(traverse_obj(api_response, {
             'series': ('series', 'title'),
-            'season': ('season', 'title'),
+            'season': ('season', 'name'),
             'season_number': ('season', 'sequence'),
             'episode_number': ('episode', 'number'),
         }))

diff --git a/yt_dlp/extractor/aenetworks.py b/yt_dlp/extractor/aenetworks.py

@@ -121,11 +121,21 @@ class AENetworksIE(AENetworksBaseIE):
         'info_dict': {
             'id': '22253814',
             'ext': 'mp4',
-            'title': 'Winter is Coming',
-            'description': 'md5:641f424b7a19d8e24f26dea22cf59d74',
+            'title': 'Winter Is Coming',
+            'description': 'md5:a40e370925074260b1c8a633c632c63a',
             'timestamp': 1338306241,
             'upload_date': '20120529',
             'uploader': 'AENE-NEW',
+            'duration': 2592.0,
+            'thumbnail': r're:^https?://.*\.jpe?g$',
+            'chapters': 'count:5',
+            'tags': 'count:14',
+            'categories': ['Mountain Men'],
+            'episode_number': 1,
+            'episode': 'Episode 1',
+            'season': 'Season 1',
+            'season_number': 1,
+            'series': 'Mountain Men',
         },
         'params': {
             # m3u8 download
@@ -143,6 +153,15 @@ class AENetworksIE(AENetworksBaseIE):
             'timestamp': 1452634428,
             'upload_date': '20160112',
             'uploader': 'AENE-NEW',
+            'duration': 1277.695,
+            'thumbnail': r're:^https?://.*\.jpe?g$',
+            'chapters': 'count:4',
+            'tags': 'count:23',
+            'episode': 'Episode 1',
+            'episode_number': 1,
+            'season': 'Season 9',
+            'season_number': 9,
+            'series': 'Duck Dynasty',
         },
         'params': {
             # m3u8 download

diff --git a/yt_dlp/extractor/airmozilla.py b/yt_dlp/extractor/airmozilla.py
deleted file

@@ -1,63 +0,0 @@
import re
from .common import InfoExtractor
from ..utils import (
int_or_none,
parse_duration,
parse_iso8601,
)
class AirMozillaIE(InfoExtractor):
_VALID_URL = r'https?://air\.mozilla\.org/(?P<id>[0-9a-z-]+)/?'
_TEST = {
'url': 'https://air.mozilla.org/privacy-lab-a-meetup-for-privacy-minded-people-in-san-francisco/',
'md5': '8d02f53ee39cf006009180e21df1f3ba',
'info_dict': {
'id': '6x4q2w',
'ext': 'mp4',
'title': 'Privacy Lab - a meetup for privacy minded people in San Francisco',
'thumbnail': r're:https?://.*/poster\.jpg',
'description': 'Brings together privacy professionals and others interested in privacy at for-profits, non-profits, and NGOs in an effort to contribute to the state of the ecosystem...',
'timestamp': 1422487800,
'upload_date': '20150128',
'location': 'SFO Commons',
'duration': 3780,
'view_count': int,
'categories': ['Main', 'Privacy'],
}
}
def _real_extract(self, url):
display_id = self._match_id(url)
webpage = self._download_webpage(url, display_id)
video_id = self._html_search_regex(r'//vid\.ly/(.*?)/embed', webpage, 'id')
embed_script = self._download_webpage('https://vid.ly/{0}/embed'.format(video_id), video_id)
jwconfig = self._parse_json(self._search_regex(
r'initCallback\((.*)\);', embed_script, 'metadata'), video_id)['config']
info_dict = self._parse_jwplayer_data(jwconfig, video_id)
view_count = int_or_none(self._html_search_regex(
r'Views since archived: ([0-9]+)',
webpage, 'view count', fatal=False))
timestamp = parse_iso8601(self._html_search_regex(
r'<time datetime="(.*?)"', webpage, 'timestamp', fatal=False))
duration = parse_duration(self._search_regex(
r'Duration:\s*(\d+\s*hours?\s*\d+\s*minutes?)',
webpage, 'duration', fatal=False))
info_dict.update({
'id': video_id,
'title': self._og_search_title(webpage),
'url': self._og_search_url(webpage),
'display_id': display_id,
'description': self._og_search_description(webpage),
'timestamp': timestamp,
'location': self._html_search_regex(r'Location: (.*)', webpage, 'location', default=None),
'duration': duration,
'view_count': view_count,
'categories': re.findall(r'<a href=".*?" class="channel">(.*?)</a>', webpage),
})
return info_dict

diff --git a/yt_dlp/extractor/allstar.py b/yt_dlp/extractor/allstar.py
new file (253 lines)

@@ -0,0 +1,253 @@
import functools
import json
from .common import InfoExtractor
from ..utils import (
ExtractorError,
OnDemandPagedList,
int_or_none,
join_nonempty,
parse_qs,
urljoin,
)
from ..utils.traversal import traverse_obj
_FIELDS = '''
_id
clipImageSource
clipImageThumb
clipLink
clipTitle
createdDate
shareId
user { _id }
username
views'''
_EXTRA_FIELDS = '''
clipLength
clipSizeBytes'''
_QUERIES = {
'clip': '''query ($id: String!) {
video: getClip(clipIdentifier: $id) {
%s %s
}
}''' % (_FIELDS, _EXTRA_FIELDS),
'montage': '''query ($id: String!) {
video: getMontage(clipIdentifier: $id) {
%s
}
}''' % _FIELDS,
'Clips': '''query ($page: Int!, $user: String!, $game: Int) {
videos: clips(search: createdDate, page: $page, user: $user, mobile: false, game: $game) {
data { %s %s }
}
}''' % (_FIELDS, _EXTRA_FIELDS),
'Montages': '''query ($page: Int!, $user: String!) {
videos: montages(search: createdDate, page: $page, user: $user) {
data { %s }
}
}''' % _FIELDS,
'Mobile Clips': '''query ($page: Int!, $user: String!) {
videos: clips(search: createdDate, page: $page, user: $user, mobile: true) {
data { %s %s }
}
}''' % (_FIELDS, _EXTRA_FIELDS),
}
class AllstarBaseIE(InfoExtractor):
@staticmethod
def _parse_video_data(video_data):
def media_url_or_none(path):
return urljoin('https://media.allstar.gg/', path)
info = traverse_obj(video_data, {
'id': ('_id', {str}),
'display_id': ('shareId', {str}),
'title': ('clipTitle', {str}),
'url': ('clipLink', {media_url_or_none}),
'thumbnails': (('clipImageThumb', 'clipImageSource'), {'url': {media_url_or_none}}),
'duration': ('clipLength', {int_or_none}),
'filesize': ('clipSizeBytes', {int_or_none}),
'timestamp': ('createdDate', {functools.partial(int_or_none, scale=1000)}),
'uploader': ('username', {str}),
'uploader_id': ('user', '_id', {str}),
'view_count': ('views', {int_or_none}),
})
if info.get('id') and info.get('url'):
basename = 'clip' if '/clips/' in info['url'] else 'montage'
info['webpage_url'] = f'https://allstar.gg/{basename}?{basename}={info["id"]}'
info.update({
'extractor_key': AllstarIE.ie_key(),
'extractor': AllstarIE.IE_NAME,
'uploader_url': urljoin('https://allstar.gg/u/', info.get('uploader_id')),
})
return info
def _call_api(self, query, variables, path, video_id=None, note=None):
response = self._download_json(
'https://a1.allstar.gg/graphql', video_id, note=note,
headers={'content-type': 'application/json'},
data=json.dumps({'variables': variables, 'query': query}).encode())
errors = traverse_obj(response, ('errors', ..., 'message', {str}))
if errors:
raise ExtractorError('; '.join(errors))
return traverse_obj(response, path)
class AllstarIE(AllstarBaseIE):
_VALID_URL = r'https?://(?:www\.)?allstar\.gg/(?P<type>(?:clip|montage))\?(?P=type)=(?P<id>[^/?#&]+)'
_TESTS = [{
'url': 'https://allstar.gg/clip?clip=64482c2da9eec30008a67d1b',
'info_dict': {
'id': '64482c2da9eec30008a67d1b',
'title': '4K on Inferno',
'url': 'md5:66befb5381eef0c9456026386c25fa55',
'thumbnail': r're:https://media\.allstar\.gg/.+\.(?:png|jpg)$',
'uploader': 'chrk.',
'ext': 'mp4',
'duration': 20,
'filesize': 21199257,
'timestamp': 1682451501,
'uploader_id': '62b8bdfc9021052f7905882d',
'uploader_url': 'https://allstar.gg/u/62b8bdfc9021052f7905882d',
'upload_date': '20230425',
'view_count': int,
}
}, {
'url': 'https://allstar.gg/clip?clip=8LJLY4JKB',
'info_dict': {
'id': '64a1ec6b887f4c0008dc50b8',
'display_id': '8LJLY4JKB',
'title': 'AK-47 3K on Mirage',
'url': 'md5:dde224fd12f035c0e2529a4ae34c4283',
'ext': 'mp4',
'thumbnail': r're:https://media\.allstar\.gg/.+\.(?:png|jpg)$',
'duration': 16,
'filesize': 30175859,
'timestamp': 1688333419,
'uploader': 'cherokee',
'uploader_id': '62b8bdfc9021052f7905882d',
'uploader_url': 'https://allstar.gg/u/62b8bdfc9021052f7905882d',
'upload_date': '20230702',
'view_count': int,
}
}, {
'url': 'https://allstar.gg/montage?montage=643e64089da7e9363e1fa66c',
'info_dict': {
'id': '643e64089da7e9363e1fa66c',
'display_id': 'APQLGM2IMXW',
'title': 'cherokee Rapid Fire Snipers Montage',
'url': 'md5:a3ee356022115db2b27c81321d195945',
'thumbnail': r're:https://media\.allstar\.gg/.+\.(?:png|jpg)$',
'ext': 'mp4',
'timestamp': 1681810448,
'uploader': 'cherokee',
'uploader_id': '62b8bdfc9021052f7905882d',
'uploader_url': 'https://allstar.gg/u/62b8bdfc9021052f7905882d',
'upload_date': '20230418',
'view_count': int,
}
}, {
'url': 'https://allstar.gg/montage?montage=RILJMH6QOS',
'info_dict': {
'id': '64a2697372ce3703de29e868',
'display_id': 'RILJMH6QOS',
'title': 'cherokee Rapid Fire Snipers Montage',
'url': 'md5:d5672e6f88579730c2310a80fdbc4030',
'thumbnail': r're:https://media\.allstar\.gg/.+\.(?:png|jpg)$',
'ext': 'mp4',
'timestamp': 1688365434,
'uploader': 'cherokee',
'uploader_id': '62b8bdfc9021052f7905882d',
'uploader_url': 'https://allstar.gg/u/62b8bdfc9021052f7905882d',
'upload_date': '20230703',
'view_count': int,
}
}]
def _real_extract(self, url):
query_id, video_id = self._match_valid_url(url).group('type', 'id')
return self._parse_video_data(
self._call_api(
_QUERIES.get(query_id), {'id': video_id}, ('data', 'video'), video_id))
class AllstarProfileIE(AllstarBaseIE):
_VALID_URL = r'https?://(?:www\.)?allstar\.gg/(?:profile\?user=|u/)(?P<id>[^/?#&]+)'
_TESTS = [{
'url': 'https://allstar.gg/profile?user=62b8bdfc9021052f7905882d',
'info_dict': {
'id': '62b8bdfc9021052f7905882d-clips',
'title': 'cherokee - Clips',
},
'playlist_mincount': 15
}, {
'url': 'https://allstar.gg/u/cherokee?game=730&view=Clips',
'info_dict': {
'id': '62b8bdfc9021052f7905882d-clips-730',
'title': 'cherokee - Clips - 730',
},
'playlist_mincount': 15
}, {
'url': 'https://allstar.gg/u/62b8bdfc9021052f7905882d?view=Montages',
'info_dict': {
'id': '62b8bdfc9021052f7905882d-montages',
'title': 'cherokee - Montages',
},
'playlist_mincount': 4
}, {
'url': 'https://allstar.gg/profile?user=cherokee&view=Mobile Clips',
'info_dict': {
'id': '62b8bdfc9021052f7905882d-mobile',
'title': 'cherokee - Mobile Clips',
},
'playlist_mincount': 1
}]
_PAGE_SIZE = 10
def _get_page(self, user_id, display_id, game, query, page_num):
page_num += 1
for video_data in self._call_api(
query, {
'user': user_id,
'page': page_num,
'game': game,
}, ('data', 'videos', 'data'), display_id, f'Downloading page {page_num}'):
yield self._parse_video_data(video_data)
def _real_extract(self, url):
display_id = self._match_id(url)
profile_data = self._download_json(
urljoin('https://api.allstar.gg/v1/users/profile/', display_id), display_id)
user_id = traverse_obj(profile_data, ('data', ('_id'), {str}))
if not user_id:
raise ExtractorError('Unable to extract the user id')
username = traverse_obj(profile_data, ('data', 'profile', ('username'), {str}))
url_query = parse_qs(url)
game = traverse_obj(url_query, ('game', 0, {int_or_none}))
query_id = traverse_obj(url_query, ('view', 0), default='Clips')
if query_id not in ('Clips', 'Montages', 'Mobile Clips'):
raise ExtractorError(f'Unsupported playlist URL type {query_id!r}')
return self.playlist_result(
OnDemandPagedList(
functools.partial(
self._get_page, user_id, display_id, game, _QUERIES.get(query_id)), self._PAGE_SIZE),
playlist_id=join_nonempty(user_id, query_id.lower().split()[0], game),
playlist_title=join_nonempty((username or display_id), query_id, game, delim=' - '))
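
Everything above goes through a single GraphQL endpoint; a rough sketch of the request shape _call_api() produces, using the clip id from the first test (stdlib only, headers kept minimal):

    import json
    import urllib.request

    query = '''query ($id: String!) {
      video: getClip(clipIdentifier: $id) { _id clipTitle clipLink }
    }'''
    req = urllib.request.Request(
        'https://a1.allstar.gg/graphql',
        data=json.dumps({'variables': {'id': '64482c2da9eec30008a67d1b'}, 'query': query}).encode(),
        headers={'content-type': 'application/json'})
    with urllib.request.urlopen(req) as resp:
        video = json.load(resp)['data']['video']
    print(video['clipTitle'], video['clipLink'])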

diff --git a/yt_dlp/extractor/altcensored.py b/yt_dlp/extractor/altcensored.py
new file

@@ -0,0 +1,96 @@
import re
from .archiveorg import ArchiveOrgIE
from .common import InfoExtractor
from ..utils import (
InAdvancePagedList,
int_or_none,
orderedSet,
str_to_int,
urljoin,
)
class AltCensoredIE(InfoExtractor):
IE_NAME = 'altcensored'
_VALID_URL = r'https?://(?:www\.)?altcensored\.com/(?:watch\?v=|embed/)(?P<id>[^/?#]+)'
_TESTS = [{
'url': 'https://www.altcensored.com/watch?v=k0srjLSkga8',
'info_dict': {
'id': 'youtube-k0srjLSkga8',
'ext': 'webm',
'title': "QUELLES SONT LES CONSÉQUENCES DE L'HYPERSEXUALISATION DE LA SOCIÉTÉ ?",
'display_id': 'k0srjLSkga8.webm',
'release_date': '20180403',
'creator': 'Virginie Vota',
'release_year': 2018,
'upload_date': '20230318',
'uploader': 'admin@altcensored.com',
'description': 'md5:0b38a8fc04103579d5c1db10a247dc30',
'timestamp': 1679161343,
'track': 'k0srjLSkga8',
'duration': 926.09,
'thumbnail': 'https://archive.org/download/youtube-k0srjLSkga8/youtube-k0srjLSkga8.thumbs/k0srjLSkga8_000925.jpg',
'view_count': int,
'categories': ['News & Politics'],
}
}]
def _real_extract(self, url):
video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id)
return {
'_type': 'url_transparent',
'url': f'https://archive.org/details/youtube-{video_id}',
'ie_key': ArchiveOrgIE.ie_key(),
'view_count': str_to_int(self._html_search_regex(
r'YouTube Views:(?:\s|&nbsp;)*([\d,]+)', webpage, 'view count', default=None)),
'categories': self._html_search_regex(
r'<a href="/category/\d+">\s*\n?\s*([^<]+)</a>',
webpage, 'category', default='').split() or None,
}
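# Note: the url_transparent result above delegates extraction to ArchiveOrgIE
# while the fields gathered here overlay its output. A minimal sketch of the
# pattern (hypothetical extractor and metadata, not part of this file):
#
#     def _real_extract(self, url):
#         video_id = self._match_id(url)
#         return {
#             '_type': 'url_transparent',   # delegate extraction...
#             'url': f'https://example.org/items/{video_id}',
#             'ie_key': 'SomeOtherIE',      # ...to this extractor
#             'view_count': 12345,          # fields here override the delegate's
#         }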
class AltCensoredChannelIE(InfoExtractor):
IE_NAME = 'altcensored:channel'
_VALID_URL = r'https?://(?:www\.)?altcensored\.com/channel/(?!page|table)(?P<id>[^/?#]+)'
_PAGE_SIZE = 24
_TESTS = [{
'url': 'https://www.altcensored.com/channel/UCFPTO55xxHqFqkzRZHu4kcw',
'info_dict': {
'title': 'Virginie Vota',
'id': 'UCFPTO55xxHqFqkzRZHu4kcw',
},
'playlist_count': 91
}, {
'url': 'https://altcensored.com/channel/UC9CcJ96HKMWn0LZlcxlpFTw',
'info_dict': {
'title': 'yukikaze775',
'id': 'UC9CcJ96HKMWn0LZlcxlpFTw',
},
'playlist_count': 4
}]
def _real_extract(self, url):
channel_id = self._match_id(url)
webpage = self._download_webpage(
url, channel_id, 'Download channel webpage', 'Unable to get channel webpage')
title = self._html_search_meta('altcen_title', webpage, 'title', fatal=False)
page_count = int_or_none(self._html_search_regex(
r'<a[^>]+href="/channel/\w+/page/(\d+)">(?:\1)</a>',
webpage, 'page count', default='1'))
def page_func(page_num):
page_num += 1
webpage = self._download_webpage(
f'https://altcensored.com/channel/{channel_id}/page/{page_num}',
channel_id, note=f'Downloading page {page_num}')
items = re.findall(r'<a[^>]+href="(/watch\?v=[^"]+)', webpage)
return [self.url_result(urljoin('https://www.altcensored.com', path), AltCensoredIE)
for path in orderedSet(items)]
return self.playlist_result(
InAdvancePagedList(page_func, page_count, self._PAGE_SIZE),
playlist_id=channel_id, playlist_title=title)
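
InAdvancePagedList fetches pages lazily when the total page count is known up front, as in page_func() above; a toy illustration with synthetic data (assuming the getslice() API of yt_dlp.utils.PagedList):

    from yt_dlp.utils import InAdvancePagedList

    def fake_page(page_num):  # would normally download page `page_num`
        return [f'video-{page_num}-{i}' for i in range(24)]

    pages = InAdvancePagedList(fake_page, 3, 24)  # pagefunc, pagecount, pagesize
    first_page = pages.getslice(0, 24)  # fetches only the needed page(s)
    assert first_page[0] == 'video-0-0'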

diff --git a/yt_dlp/extractor/aol.py b/yt_dlp/extractor/aol.py

@@ -10,6 +10,7 @@ from ..utils import (
 class AolIE(YahooIE):  # XXX: Do not subclass from concrete IE
+    _WORKING = False
     IE_NAME = 'aol.com'
     _VALID_URL = r'(?:aol-video:|https?://(?:www\.)?aol\.(?:com|ca|co\.uk|de|jp)/video/(?:[^/]+/)*)(?P<id>\d{9}|[0-9a-f]{24}|[0-9a-f]{8}-(?:[0-9a-f]{4}-){3}[0-9a-f]{12})'

diff --git a/yt_dlp/extractor/archiveorg.py b/yt_dlp/extractor/archiveorg.py

@@ -52,7 +52,6 @@ class ArchiveOrgIE(InfoExtractor):
             'creator': 'SRI International',
             'uploader': 'laura@archive.org',
             'thumbnail': r're:https://archive\.org/download/.*\.jpg',
-            'release_year': 1968,
             'display_id': 'XD300-23_68HighlightsAResearchCntAugHumanIntellect.cdr',
             'track': 'XD300-23 68HighlightsAResearchCntAugHumanIntellect',
@@ -134,7 +133,6 @@ class ArchiveOrgIE(InfoExtractor):
             'album': '1977-05-08 - Barton Hall - Cornell University',
             'release_date': '19770508',
             'display_id': 'gd1977-05-08d01t07.flac',
-            'release_year': 1977,
             'track_number': 7,
         },
     }, {

diff --git a/yt_dlp/extractor/ard.py b/yt_dlp/extractor/ard.py

@@ -1,24 +1,23 @@
-import json
 import re
+from functools import partial

 from .common import InfoExtractor
-from .generic import GenericIE
 from ..utils import (
+    OnDemandPagedList,
     determine_ext,
-    ExtractorError,
     int_or_none,
+    join_nonempty,
+    make_archive_id,
     parse_duration,
-    qualities,
+    parse_iso8601,
+    remove_start,
     str_or_none,
-    try_get,
     unified_strdate,
-    unified_timestamp,
+    update_url,
     update_url_query,
     url_or_none,
     xpath_text,
 )
-from ..compat import compat_etree_fromstring
+from ..utils.traversal import traverse_obj


 class ARDMediathekBaseIE(InfoExtractor):
@@ -61,45 +60,6 @@ class ARDMediathekBaseIE(InfoExtractor):
             'subtitles': subtitles,
         }

-    def _ARD_extract_episode_info(self, title):
-        """Try to extract season/episode data from the title."""
-        res = {}
-        if not title:
-            return res
-
-        for pattern in [
-            # Pattern for title like "Homo sapiens (S06/E07) - Originalversion"
-            # from: https://www.ardmediathek.de/one/sendung/doctor-who/Y3JpZDovL3dkci5kZS9vbmUvZG9jdG9yIHdobw
-            r'.*(?P<ep_info> \(S(?P<season_number>\d+)/E(?P<episode_number>\d+)\)).*',
-            # E.g.: title="Fritjof aus Norwegen (2) (AD)"
-            # from: https://www.ardmediathek.de/ard/sammlung/der-krieg-und-ich/68cMkqJdllm639Skj4c7sS/
-            r'.*(?P<ep_info> \((?:Folge |Teil )?(?P<episode_number>\d+)(?:/\d+)?\)).*',
-            r'.*(?P<ep_info>Folge (?P<episode_number>\d+)(?:\:| -|) )\"(?P<episode>.+)\".*',
-            # E.g.: title="Folge 25/42: Symmetrie"
-            # from: https://www.ardmediathek.de/ard/video/grips-mathe/folge-25-42-symmetrie/ard-alpha/Y3JpZDovL2JyLmRlL3ZpZGVvLzMyYzI0ZjczLWQ1N2MtNDAxNC05ZmZhLTFjYzRkZDA5NDU5OQ/
-            # E.g.: title="Folge 1063 - Vertrauen"
-            # from: https://www.ardmediathek.de/ard/sendung/die-fallers/Y3JpZDovL3N3ci5kZS8yMzAyMDQ4/
-            r'.*(?P<ep_info>Folge (?P<episode_number>\d+)(?:/\d+)?(?:\:| -|) ).*',
-        ]:
-            m = re.match(pattern, title)
-            if m:
-                groupdict = m.groupdict()
-                res['season_number'] = int_or_none(groupdict.get('season_number'))
-                res['episode_number'] = int_or_none(groupdict.get('episode_number'))
-                res['episode'] = str_or_none(groupdict.get('episode'))
-
-                # Build the episode title by removing numeric episode information:
-                if groupdict.get('ep_info') and not res['episode']:
-                    res['episode'] = str_or_none(
-                        title.replace(groupdict.get('ep_info'), ''))
-
-                if res['episode']:
-                    res['episode'] = res['episode'].strip()
-
-                break
-
-        # As a fallback use the whole title as the episode name:
-        if not res.get('episode'):
-            res['episode'] = title.strip()
-        return res

     def _extract_formats(self, media_info, video_id):
         type_ = media_info.get('_type')
         media_array = media_info.get('_mediaArray', [])
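
For reference, one of the title layouts the removed _ARD_extract_episode_info() helper recognized (illustrative check):

    import re

    m = re.match(r'.*(?P<ep_info>Folge (?P<episode_number>\d+)(?:/\d+)?(?:\:| -|) ).*',
                 'Folge 25/42: Symmetrie')
    assert m.group('episode_number') == '25'  # remaining title: 'Symmetrie'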
@@ -155,144 +115,12 @@ class ARDMediathekBaseIE(InfoExtractor):
         return formats


-class ARDMediathekIE(ARDMediathekBaseIE):
-    IE_NAME = 'ARD:mediathek'
-    _VALID_URL = r'^https?://(?:(?:(?:www|classic)\.)?ardmediathek\.de|mediathek\.(?:daserste|rbb-online)\.de|one\.ard\.de)/(?:.*/)(?P<video_id>[0-9]+|[^0-9][^/\?]+)[^/\?]*(?:\?.*)?'
-
-    _TESTS = [{
-        # available till 26.07.2022
-        'url': 'http://www.ardmediathek.de/tv/S%C3%9CDLICHT/Was-ist-die-Kunst-der-Zukunft-liebe-Ann/BR-Fernsehen/Video?bcastId=34633636&documentId=44726822',
-        'info_dict': {
-            'id': '44726822',
-            'ext': 'mp4',
-            'title': 'Was ist die Kunst der Zukunft, liebe Anna McCarthy?',
-            'description': 'md5:4ada28b3e3b5df01647310e41f3a62f5',
-            'duration': 1740,
-        },
-        'params': {
-            # m3u8 download
-            'skip_download': True,
-        }
-    }, {
-        'url': 'https://one.ard.de/tv/Mord-mit-Aussicht/Mord-mit-Aussicht-6-39-T%C3%B6dliche-Nach/ONE/Video?bcastId=46384294&documentId=55586872',
-        'only_matching': True,
-    }, {
-        # audio
-        'url': 'http://www.ardmediathek.de/tv/WDR-H%C3%B6rspiel-Speicher/Tod-eines-Fu%C3%9Fballers/WDR-3/Audio-Podcast?documentId=28488308&bcastId=23074086',
-        'only_matching': True,
-    }, {
-        'url': 'http://mediathek.daserste.de/sendungen_a-z/328454_anne-will/22429276_vertrauen-ist-gut-spionieren-ist-besser-geht',
-        'only_matching': True,
-    }, {
-        # audio
-        'url': 'http://mediathek.rbb-online.de/radio/Hörspiel/Vor-dem-Fest/kulturradio/Audio?documentId=30796318&topRessort=radio&bcastId=9839158',
-        'only_matching': True,
-    }, {
-        'url': 'https://classic.ardmediathek.de/tv/Panda-Gorilla-Co/Panda-Gorilla-Co-Folge-274/Das-Erste/Video?bcastId=16355486&documentId=58234698',
-        'only_matching': True,
-    }]
-
-    @classmethod
-    def suitable(cls, url):
-        return False if ARDBetaMediathekIE.suitable(url) else super(ARDMediathekIE, cls).suitable(url)
-
-    def _real_extract(self, url):
-        # determine video id from url
-        m = self._match_valid_url(url)
-
-        document_id = None
-
-        numid = re.search(r'documentId=([0-9]+)', url)
-        if numid:
-            document_id = video_id = numid.group(1)
-        else:
-            video_id = m.group('video_id')
-
-        webpage = self._download_webpage(url, video_id)
-
-        ERRORS = (
-            ('>Leider liegt eine Störung vor.', 'Video %s is unavailable'),
-            ('>Der gewünschte Beitrag ist nicht mehr verfügbar.<',
-             'Video %s is no longer available'),
-        )
-
-        for pattern, message in ERRORS:
-            if pattern in webpage:
-                raise ExtractorError(message % video_id, expected=True)
-
-        if re.search(r'[\?&]rss($|[=&])', url):
-            doc = compat_etree_fromstring(webpage.encode('utf-8'))
-            if doc.tag == 'rss':
-                return GenericIE()._extract_rss(url, video_id, doc)
-
-        title = self._og_search_title(webpage, default=None) or self._html_search_regex(
-            [r'<h1(?:\s+class="boxTopHeadline")?>(.*?)</h1>',
-             r'<meta name="dcterms\.title" content="(.*?)"/>',
-             r'<h4 class="headline">(.*?)</h4>',
-             r'<title[^>]*>(.*?)</title>'],
-            webpage, 'title')
-        description = self._og_search_description(webpage, default=None) or self._html_search_meta(
-            'dcterms.abstract', webpage, 'description', default=None)
-        if description is None:
-            description = self._html_search_meta(
-                'description', webpage, 'meta description', default=None)
-        if description is None:
-            description = self._html_search_regex(
-                r'<p\s+class="teasertext">(.+?)</p>',
-                webpage, 'teaser text', default=None)
-
-        # Thumbnail is sometimes not present.
-        # It is in the mobile version, but that seems to use a different URL
-        # structure altogether.
-        thumbnail = self._og_search_thumbnail(webpage, default=None)
-
-        media_streams = re.findall(r'''(?x)
-            mediaCollection\.addMediaStream\([0-9]+,\s*[0-9]+,\s*"[^"]*",\s*
-            "([^"]+)"''', webpage)
-
-        if media_streams:
-            QUALITIES = qualities(['lo', 'hi', 'hq'])
-            formats = []
-            for furl in set(media_streams):
-                if furl.endswith('.f4m'):
-                    fid = 'f4m'
-                else:
-                    fid_m = re.match(r'.*\.([^.]+)\.[^.]+$', furl)
-                    fid = fid_m.group(1) if fid_m else None
-                formats.append({
-                    'quality': QUALITIES(fid),
-                    'format_id': fid,
-                    'url': furl,
-                })
-            info = {
-                'formats': formats,
-            }
-        else:  # request JSON file
-            if not document_id:
-                video_id = self._search_regex(
-                    (r'/play/(?:config|media|sola)/(\d+)', r'contentId["\']\s*:\s*(\d+)'),
-                    webpage, 'media id', default=None)
-            info = self._extract_media_info(
-                'http://www.ardmediathek.de/play/media/%s' % video_id,
-                webpage, video_id)
-
-        info.update({
-            'id': video_id,
-            'title': title,
-            'description': description,
-            'thumbnail': thumbnail,
-        })
-        info.update(self._ARD_extract_episode_info(info['title']))
-
-        return info
-
-
 class ARDIE(InfoExtractor):
     _VALID_URL = r'(?P<mainurl>https?://(?:www\.)?daserste\.de/(?:[^/?#&]+/)+(?P<id>[^/?#&]+))\.html'
     _TESTS = [{
         # available till 7.12.2023
         'url': 'https://www.daserste.de/information/talk/maischberger/videos/maischberger-video-424.html',
-        'md5': 'a438f671e87a7eba04000336a119ccc4',
+        'md5': '94812e6438488fb923c361a44469614b',
         'info_dict': {
             'id': 'maischberger-video-424',
             'display_id': 'maischberger-video-424',
@@ -399,31 +227,35 @@ class ARDIE(InfoExtractor):
 }


-class ARDBetaMediathekIE(ARDMediathekBaseIE):
+class ARDBetaMediathekIE(InfoExtractor):
+    IE_NAME = 'ARDMediathek'
     _VALID_URL = r'''(?x)https://
         (?:(?:beta|www)\.)?ardmediathek\.de/
-        (?:(?P<client>[^/]+)/)?
-        (?:player|live|video|(?P<playlist>sendung|sammlung))/
-        (?:(?P<display_id>(?(playlist)[^?#]+?|[^?#]+))/)?
-        (?P<id>(?(playlist)|Y3JpZDovL)[a-zA-Z0-9]+)
-        (?(playlist)/(?P<season>\d+)?/?(?:[?#]|$))'''
+        (?:[^/]+/)?
+        (?:player|live|video)/
+        (?:(?P<display_id>[^?#]+)/)?
+        (?P<id>[a-zA-Z0-9]+)
+        /?(?:[?#]|$)'''
+
+    _GEO_COUNTRIES = ['DE']
_TESTS = [{ _TESTS = [{
'url': 'https://www.ardmediathek.de/video/filme-im-mdr/wolfsland-die-traurigen-schwestern/mdr-fernsehen/Y3JpZDovL21kci5kZS9iZWl0cmFnL2Ntcy8xZGY0ZGJmZS00ZWQwLTRmMGItYjhhYy0wOGQ4ZmYxNjVhZDI', 'url': 'https://www.ardmediathek.de/video/filme-im-mdr/liebe-auf-vier-pfoten/mdr-fernsehen/Y3JpZDovL21kci5kZS9zZW5kdW5nLzI4MjA0MC80MjIwOTEtNDAyNTM0',
'md5': '3fd5fead7a370a819341129c8d713136', 'md5': 'b6e8ab03f2bcc6e1f9e6cef25fcc03c4',
'info_dict': { 'info_dict': {
'display_id': 'filme-im-mdr/wolfsland-die-traurigen-schwestern/mdr-fernsehen', 'display_id': 'filme-im-mdr/liebe-auf-vier-pfoten/mdr-fernsehen',
'id': '12172961', 'id': 'Y3JpZDovL21kci5kZS9zZW5kdW5nLzI4MjA0MC80MjIwOTEtNDAyNTM0',
'title': 'Wolfsland - Die traurigen Schwestern', 'title': 'Liebe auf vier Pfoten',
'description': r're:^Als der Polizeiobermeister Raaben', 'description': r're:^Claudia Schmitt, Anwältin in Salzburg',
'duration': 5241, 'duration': 5222,
'thumbnail': 'https://api.ardmediathek.de/image-service/images/urn:ard:image:efa186f7b0054957', 'thumbnail': 'https://api.ardmediathek.de/image-service/images/urn:ard:image:aee7cbf8f06de976?w=960&ch=ae4d0f2ee47d8b9b',
'timestamp': 1670710500, 'timestamp': 1701343800,
'upload_date': '20221210', 'upload_date': '20231130',
'ext': 'mp4', 'ext': 'mp4',
'age_limit': 12, 'episode': 'Liebe auf vier Pfoten',
'episode': 'Wolfsland - Die traurigen Schwestern', 'series': 'Filme im MDR',
'series': 'Filme im MDR' 'age_limit': 0,
'channel': 'MDR',
'_old_archive_ids': ['ardbetamediathek 12939099'],
}, },
}, { }, {
'url': 'https://www.ardmediathek.de/mdr/video/die-robuste-roswita/Y3JpZDovL21kci5kZS9iZWl0cmFnL2Ntcy84MWMxN2MzZC0wMjkxLTRmMzUtODk4ZS0wYzhlOWQxODE2NGI/', 'url': 'https://www.ardmediathek.de/mdr/video/die-robuste-roswita/Y3JpZDovL21kci5kZS9iZWl0cmFnL2Ntcy84MWMxN2MzZC0wMjkxLTRmMzUtODk4ZS0wYzhlOWQxODE2NGI/',
@@ -444,7 +276,7 @@ class ARDBetaMediathekIE(ARDMediathekBaseIE):
'url': 'https://www.ardmediathek.de/video/tagesschau-oder-tagesschau-20-00-uhr/das-erste/Y3JpZDovL2Rhc2Vyc3RlLmRlL3RhZ2Vzc2NoYXUvZmM4ZDUxMjgtOTE0ZC00Y2MzLTgzNzAtNDZkNGNiZWJkOTll', 'url': 'https://www.ardmediathek.de/video/tagesschau-oder-tagesschau-20-00-uhr/das-erste/Y3JpZDovL2Rhc2Vyc3RlLmRlL3RhZ2Vzc2NoYXUvZmM4ZDUxMjgtOTE0ZC00Y2MzLTgzNzAtNDZkNGNiZWJkOTll',
'md5': '1e73ded21cb79bac065117e80c81dc88', 'md5': '1e73ded21cb79bac065117e80c81dc88',
'info_dict': { 'info_dict': {
'id': '10049223', 'id': 'Y3JpZDovL2Rhc2Vyc3RlLmRlL3RhZ2Vzc2NoYXUvZmM4ZDUxMjgtOTE0ZC00Y2MzLTgzNzAtNDZkNGNiZWJkOTll',
'ext': 'mp4', 'ext': 'mp4',
'title': 'tagesschau, 20:00 Uhr', 'title': 'tagesschau, 20:00 Uhr',
'timestamp': 1636398000, 'timestamp': 1636398000,
@@ -454,7 +286,27 @@ class ARDBetaMediathekIE(ARDMediathekBaseIE):
'duration': 915, 'duration': 915,
'episode': 'tagesschau, 20:00 Uhr', 'episode': 'tagesschau, 20:00 Uhr',
'series': 'tagesschau', 'series': 'tagesschau',
'thumbnail': 'https://api.ardmediathek.de/image-service/images/urn:ard:image:fbb21142783b0a49', 'thumbnail': 'https://api.ardmediathek.de/image-service/images/urn:ard:image:fbb21142783b0a49?w=960&ch=ee69108ae344f678',
'channel': 'ARD-Aktuell',
'_old_archive_ids': ['ardbetamediathek 10049223'],
},
}, {
'url': 'https://www.ardmediathek.de/video/7-tage/7-tage-unter-harten-jungs/hr-fernsehen/N2I2YmM5MzgtNWFlOS00ZGFlLTg2NzMtYzNjM2JlNjk4MDg3',
'md5': 'c428b9effff18ff624d4f903bda26315',
'info_dict': {
'id': 'N2I2YmM5MzgtNWFlOS00ZGFlLTg2NzMtYzNjM2JlNjk4MDg3',
'ext': 'mp4',
'duration': 2700,
'episode': '7 Tage ... unter harten Jungs',
'description': 'md5:0f215470dcd2b02f59f4bd10c963f072',
'upload_date': '20231005',
'timestamp': 1696491171,
'display_id': '7-tage/7-tage-unter-harten-jungs/hr-fernsehen',
'series': '7 Tage ...',
'channel': 'HR',
'thumbnail': 'https://api.ardmediathek.de/image-service/images/urn:ard:image:f6e6d5ffac41925c?w=960&ch=fa32ba69bc87989a',
'title': '7 Tage ... unter harten Jungs',
'_old_archive_ids': ['ardbetamediathek 94834686'],
}, },
}, { }, {
'url': 'https://beta.ardmediathek.de/ard/video/Y3JpZDovL2Rhc2Vyc3RlLmRlL3RhdG9ydC9mYmM4NGM1NC0xNzU4LTRmZGYtYWFhZS0wYzcyZTIxNGEyMDE', 'url': 'https://beta.ardmediathek.de/ard/video/Y3JpZDovL2Rhc2Vyc3RlLmRlL3RhdG9ydC9mYmM4NGM1NC0xNzU4LTRmZGYtYWFhZS0wYzcyZTIxNGEyMDE',
@@ -471,203 +323,230 @@ class ARDBetaMediathekIE(ARDMediathekBaseIE):
}, { }, {
'url': 'https://www.ardmediathek.de/swr/live/Y3JpZDovL3N3ci5kZS8xMzQ4MTA0Mg', 'url': 'https://www.ardmediathek.de/swr/live/Y3JpZDovL3N3ci5kZS8xMzQ4MTA0Mg',
'only_matching': True, 'only_matching': True,
}, {
'url': 'https://www.ardmediathek.de/video/coronavirus-update-ndr-info/astrazeneca-kurz-lockdown-und-pims-syndrom-81/ndr/Y3JpZDovL25kci5kZS84NzE0M2FjNi0wMWEwLTQ5ODEtOTE5NS1mOGZhNzdhOTFmOTI/',
'only_matching': True,
}]
def _extract_episode_info(self, title):
patterns = [
# Pattern for title like "Homo sapiens (S06/E07) - Originalversion"
# from: https://www.ardmediathek.de/one/sendung/doctor-who/Y3JpZDovL3dkci5kZS9vbmUvZG9jdG9yIHdobw
r'.*(?P<ep_info> \(S(?P<season_number>\d+)/E(?P<episode_number>\d+)\)).*',
# E.g.: title="Fritjof aus Norwegen (2) (AD)"
# from: https://www.ardmediathek.de/ard/sammlung/der-krieg-und-ich/68cMkqJdllm639Skj4c7sS/
r'.*(?P<ep_info> \((?:Folge |Teil )?(?P<episode_number>\d+)(?:/\d+)?\)).*',
r'.*(?P<ep_info>Folge (?P<episode_number>\d+)(?:\:| -|) )\"(?P<episode>.+)\".*',
# E.g.: title="Folge 25/42: Symmetrie"
# from: https://www.ardmediathek.de/ard/video/grips-mathe/folge-25-42-symmetrie/ard-alpha/Y3JpZDovL2JyLmRlL3ZpZGVvLzMyYzI0ZjczLWQ1N2MtNDAxNC05ZmZhLTFjYzRkZDA5NDU5OQ/
# E.g.: title="Folge 1063 - Vertrauen"
# from: https://www.ardmediathek.de/ard/sendung/die-fallers/Y3JpZDovL3N3ci5kZS8yMzAyMDQ4/
r'.*(?P<ep_info>Folge (?P<episode_number>\d+)(?:/\d+)?(?:\:| -|) ).*',
# As a fallback use the full title
r'(?P<title>.*)',
]
return traverse_obj(patterns, (..., {partial(re.match, string=title)}, {
'season_number': ('season_number', {int_or_none}),
'episode_number': ('episode_number', {int_or_none}),
'episode': ((
('episode', {str_or_none}),
('ep_info', {lambda x: title.replace(x, '')}),
('title', {str}),
), {str.strip}),
}), get_all=False)
def _real_extract(self, url):
video_id, display_id = self._match_valid_url(url).group('id', 'display_id')
page_data = self._download_json(
f'https://api.ardmediathek.de/page-gateway/pages/ard/item/{video_id}', video_id, query={
'embedded': 'false',
'mcV6': 'true',
})
player_data = traverse_obj(
page_data, ('widgets', lambda _, v: v['type'] in ('player_ondemand', 'player_live'), {dict}), get_all=False)
is_live = player_data.get('type') == 'player_live'
media_data = traverse_obj(player_data, ('mediaCollection', 'embedded', {dict}))
if player_data.get('blockedByFsk'):
self.raise_no_formats('This video is only available after 22:00', expected=True)
formats = []
subtitles = {}
for stream in traverse_obj(media_data, ('streams', ..., {dict})):
kind = stream.get('kind')
# Prioritize main stream over sign language and others
preference = 1 if kind == 'main' else None
for media in traverse_obj(stream, ('media', lambda _, v: url_or_none(v['url']))):
media_url = media['url']
audio_kind = traverse_obj(media, (
'audios', 0, 'kind', {str}), default='').replace('standard', '')
lang_code = traverse_obj(media, ('audios', 0, 'languageCode', {str})) or 'deu'
lang = join_nonempty(lang_code, audio_kind)
language_preference = 10 if lang == 'deu' else -10
if determine_ext(media_url) == 'm3u8':
fmts, subs = self._extract_m3u8_formats_and_subtitles(
media_url, video_id, m3u8_id=f'hls-{kind}', preference=preference, fatal=False, live=is_live)
for f in fmts:
f['language'] = lang
f['language_preference'] = language_preference
formats.extend(fmts)
self._merge_subtitles(subs, target=subtitles)
else:
formats.append({
'url': media_url,
'format_id': f'http-{kind}',
'preference': preference,
'language': lang,
'language_preference': language_preference,
**traverse_obj(media, {
'format_note': ('forcedLabel', {str}),
'width': ('maxHResolutionPx', {int_or_none}),
'height': ('maxVResolutionPx', {int_or_none}),
'vcodec': ('videoCodec', {str}),
}),
})
for sub in traverse_obj(media_data, ('subtitles', ..., {dict})):
for sources in traverse_obj(sub, ('sources', lambda _, v: url_or_none(v['url']))):
subtitles.setdefault(sub.get('languageCode') or 'deu', []).append({
'url': sources['url'],
'ext': {'webvtt': 'vtt', 'ebutt': 'ttml'}.get(sources.get('kind')),
})
age_limit = traverse_obj(page_data, ('fskRating', {lambda x: remove_start(x, 'FSK')}, {int_or_none}))
old_id = traverse_obj(page_data, ('tracking', 'atiCustomVars', 'contentId'))
return {
'id': video_id,
'display_id': display_id,
'formats': formats,
'subtitles': subtitles,
'is_live': is_live,
'age_limit': age_limit,
**traverse_obj(media_data, ('meta', {
'title': 'title',
'description': 'synopsis',
'timestamp': ('broadcastedOnDateTime', {parse_iso8601}),
'series': 'seriesTitle',
'thumbnail': ('images', 0, 'url', {url_or_none}),
'duration': ('durationSeconds', {int_or_none}),
'channel': 'clipSourceName',
})),
**self._extract_episode_info(page_data.get('title')),
'_old_archive_ids': [make_archive_id(ARDBetaMediathekIE, old_id)],
}
class ARDMediathekCollectionIE(InfoExtractor):
_VALID_URL = r'''(?x)https://
(?:(?:beta|www)\.)?ardmediathek\.de/
(?:[^/?#]+/)?
(?P<playlist>sendung|serie|sammlung)/
(?:(?P<display_id>[^?#]+?)/)?
(?P<id>[a-zA-Z0-9]+)
(?:/(?P<season>\d+)(?:/(?P<version>OV|AD))?)?/?(?:[?#]|$)'''
_GEO_COUNTRIES = ['DE']
_TESTS = [{
'url': 'https://www.ardmediathek.de/serie/quiz/staffel-1-originalversion/Y3JpZDovL3dkci5kZS9vbmUvcXVpeg/1/OV',
'info_dict': {
'id': 'Y3JpZDovL3dkci5kZS9vbmUvcXVpeg_1_OV',
'display_id': 'quiz/staffel-1-originalversion',
'title': 'Staffel 1 Originalversion',
},
'playlist_count': 3,
}, {
'url': 'https://www.ardmediathek.de/serie/babylon-berlin/staffel-4-mit-audiodeskription/Y3JpZDovL2Rhc2Vyc3RlLmRlL2JhYnlsb24tYmVybGlu/4/AD',
'info_dict': {
'id': 'Y3JpZDovL2Rhc2Vyc3RlLmRlL2JhYnlsb24tYmVybGlu_4_AD',
'display_id': 'babylon-berlin/staffel-4-mit-audiodeskription',
'title': 'Staffel 4 mit Audiodeskription',
},
'playlist_count': 12,
}, {
'url': 'https://www.ardmediathek.de/serie/babylon-berlin/staffel-1/Y3JpZDovL2Rhc2Vyc3RlLmRlL2JhYnlsb24tYmVybGlu/1/',
'info_dict': {
'id': 'Y3JpZDovL2Rhc2Vyc3RlLmRlL2JhYnlsb24tYmVybGlu_1',
'display_id': 'babylon-berlin/staffel-1',
'title': 'Staffel 1',
},
'playlist_count': 8,
}, {
'url': 'https://www.ardmediathek.de/sendung/tatort/Y3JpZDovL2Rhc2Vyc3RlLmRlL3RhdG9ydA',
'info_dict': {
'id': 'Y3JpZDovL2Rhc2Vyc3RlLmRlL3RhdG9ydA',
'display_id': 'tatort',
'title': 'Tatort',
},
'playlist_mincount': 500,
}, {
'url': 'https://www.ardmediathek.de/sammlung/die-kirche-bleibt-im-dorf/5eOHzt8XB2sqeFXbIoJlg2',
'info_dict': {
'id': '5eOHzt8XB2sqeFXbIoJlg2',
'display_id': 'die-kirche-bleibt-im-dorf',
'title': 'Die Kirche bleibt im Dorf',
'description': 'Die Kirche bleibt im Dorf',
},
'playlist_count': 4,
     }, {
         # playlist of type 'sendung'
         'url': 'https://www.ardmediathek.de/ard/sendung/doctor-who/Y3JpZDovL3dkci5kZS9vbmUvZG9jdG9yIHdobw/',
         'only_matching': True,
+    }, {
+        # playlist of type 'serie'
+        'url': 'https://www.ardmediathek.de/serie/nachtstreife/staffel-1/Y3JpZDovL3N3ci5kZS9zZGIvc3RJZC8xMjQy/1',
+        'only_matching': True,
     }, {
         # playlist of type 'sammlung'
         'url': 'https://www.ardmediathek.de/ard/sammlung/team-muenster/5JpTzLSbWUAK8184IOvEir/',
         'only_matching': True,
-    }, {
-        'url': 'https://www.ardmediathek.de/video/coronavirus-update-ndr-info/astrazeneca-kurz-lockdown-und-pims-syndrom-81/ndr/Y3JpZDovL25kci5kZS84NzE0M2FjNi0wMWEwLTQ5ODEtOTE5NS1mOGZhNzdhOTFmOTI/',
-        'only_matching': True,
-    }, {
-        'url': 'https://www.ardmediathek.de/ard/player/Y3JpZDovL3dkci5kZS9CZWl0cmFnLWQ2NDJjYWEzLTMwZWYtNGI4NS1iMTI2LTU1N2UxYTcxOGIzOQ/tatort-duo-koeln-leipzig-ihr-kinderlein-kommet',
-        'only_matching': True,
     }]
-    def _ARD_load_playlist_snipped(self, playlist_id, display_id, client, mode, pageNumber):
+    _PAGE_SIZE = 100
""" Query the ARD server for playlist information
and returns the data in "raw" format """
if mode == 'sendung':
graphQL = json.dumps({
'query': '''{
showPage(
client: "%s"
showId: "%s"
pageNumber: %d
) {
pagination {
pageSize
totalElements
}
teasers { # Array
mediumTitle
links { target { id href title } }
type
}
}}''' % (client, playlist_id, pageNumber),
}).encode()
else: # mode == 'sammlung'
graphQL = json.dumps({
'query': '''{
morePage(
client: "%s"
compilationId: "%s"
pageNumber: %d
) {
widget {
pagination {
pageSize
totalElements
}
teasers { # Array
mediumTitle
links { target { id href title } }
type
}
}
}}''' % (client, playlist_id, pageNumber),
}).encode()
# Resources for ARD GraphQL debugging:
# https://api-test.ardmediathek.de/public-gateway
show_page = self._download_json(
'https://api.ardmediathek.de/public-gateway',
'[Playlist] %s' % display_id,
data=graphQL,
headers={'Content-Type': 'application/json'})['data']
# align the structure of the returned data:
if mode == 'sendung':
show_page = show_page['showPage']
else: # mode == 'sammlung'
show_page = show_page['morePage']['widget']
return show_page
def _ARD_extract_playlist(self, url, playlist_id, display_id, client, mode):
""" Collects all playlist entries and returns them as info dict.
Supports playlists of mode 'sendung' and 'sammlung', and also nested
playlists. """
entries = []
pageNumber = 0
while True: # iterate by pageNumber
show_page = self._ARD_load_playlist_snipped(
playlist_id, display_id, client, mode, pageNumber)
for teaser in show_page['teasers']: # process playlist items
if '/compilation/' in teaser['links']['target']['href']:
# alternative cond.: teaser['type'] == "compilation"
# => This is a nested compilation, e.g.:
# https://www.ardmediathek.de/ard/sammlung/die-kirche-bleibt-im-dorf/5eOHzt8XB2sqeFXbIoJlg2/
link_mode = 'sammlung'
else:
link_mode = 'video'
item_url = 'https://www.ardmediathek.de/%s/%s/%s/%s/%s' % (
client, link_mode, display_id,
# perform HTML quoting of the episode title, similar to ARD:
re.sub('^-|-$', '',  # remove '-' from beginning/end
re.sub('[^a-zA-Z0-9]+', '-', # replace special chars by -
teaser['links']['target']['title'].lower()
.replace('ä', 'ae').replace('ö', 'oe')
.replace('ü', 'ue').replace('ß', 'ss'))),
teaser['links']['target']['id'])
entries.append(self.url_result(
item_url,
ie=ARDBetaMediathekIE.ie_key()))
if (show_page['pagination']['pageSize'] * (pageNumber + 1)
>= show_page['pagination']['totalElements']):
# we've processed enough pages to get all playlist entries
break
pageNumber = pageNumber + 1
return self.playlist_result(entries, playlist_id, playlist_title=display_id)
     def _real_extract(self, url):
-        video_id, display_id, playlist_type, client, season_number = self._match_valid_url(url).group(
-            'id', 'display_id', 'playlist', 'client', 'season')
-        display_id, client = display_id or video_id, client or 'ard'
-        if playlist_type:
-            # TODO: Extract only specified season
-            return self._ARD_extract_playlist(url, video_id, display_id, client, playlist_type)
+        playlist_id, display_id, playlist_type, season_number, version = self._match_valid_url(url).group(
+            'id', 'display_id', 'playlist', 'season', 'version')
+
+        def call_api(page_num):
+            api_path = 'compilations/ard' if playlist_type == 'sammlung' else 'widgets/ard/asset'
+            return self._download_json(
+                f'https://api.ardmediathek.de/page-gateway/{api_path}/{playlist_id}', playlist_id,
+                f'Downloading playlist page {page_num}', query={
+                    'pageNumber': page_num,
+                    'pageSize': self._PAGE_SIZE,
+                    **({
+                        'seasoned': 'true',
+                        'seasonNumber': season_number,
+                        'withOriginalversion': 'true' if version == 'OV' else 'false',
+                        'withAudiodescription': 'true' if version == 'AD' else 'false',
+                    } if season_number else {}),
+                })
-        player_page = self._download_json(
-            'https://api.ardmediathek.de/public-gateway',
-            display_id, data=json.dumps({
-                'query': '''{
-  playerPage(client:"%s", clipId: "%s") {
-    blockedByFsk
-    broadcastedOn
-    maturityContentRating
-    mediaCollection {
-      _duration
-      _geoblocked
-      _isLive
-      _mediaArray {
-        _mediaStreamArray {
-          _quality
+        def fetch_page(page_num):
+            for item in traverse_obj(call_api(page_num), ('teasers', ..., {dict})):
+                item_id = traverse_obj(item, ('links', 'target', ('urlId', 'id')), 'id', get_all=False)
+                if not item_id or item_id == playlist_id:
+                    continue
+                item_mode = 'sammlung' if item.get('type') == 'compilation' else 'video'
+                yield self.url_result(
+                    f'https://www.ardmediathek.de/{item_mode}/{item_id}',
+                    ie=(ARDMediathekCollectionIE if item_mode == 'sammlung' else ARDBetaMediathekIE),
+                    **traverse_obj(item, {
+                        'id': ('id', {str}),
+                        'title': ('longTitle', {str}),
+                        'duration': ('duration', {int_or_none}),
+                        'timestamp': ('broadcastedOn', {parse_iso8601}),
+                    }))
_server
_stream
}
}
_previewImage
_subtitleUrl
_type
}
show {
title
}
image {
src
}
synopsis
title
tracking {
atiCustomVars {
contentId
}
}
}
}''' % (client, video_id),
}).encode(), headers={
'Content-Type': 'application/json'
})['data']['playerPage']
title = player_page['title']
content_id = str_or_none(try_get(
player_page, lambda x: x['tracking']['atiCustomVars']['contentId']))
media_collection = player_page.get('mediaCollection') or {}
if not media_collection and content_id:
media_collection = self._download_json(
'https://www.ardmediathek.de/play/media/' + content_id,
content_id, fatal=False) or {}
info = self._parse_media_info(
media_collection, content_id or video_id,
player_page.get('blockedByFsk'))
age_limit = None
description = player_page.get('synopsis')
maturity_content_rating = player_page.get('maturityContentRating')
if maturity_content_rating:
age_limit = int_or_none(maturity_content_rating.lstrip('FSK'))
if not age_limit and description:
age_limit = int_or_none(self._search_regex(
r'\(FSK\s*(\d+)\)\s*$', description, 'age limit', default=None))
info.update({
'age_limit': age_limit,
'display_id': display_id,
'title': title,
'description': description,
'timestamp': unified_timestamp(player_page.get('broadcastedOn')),
'series': try_get(player_page, lambda x: x['show']['title']),
'thumbnail': (media_collection.get('_previewImage')
or try_get(player_page, lambda x: update_url(x['image']['src'], query=None, fragment=None))
or self.get_thumbnail_from_html(display_id, url)),
})
info.update(self._ARD_extract_episode_info(info['title']))
return info
-    def get_thumbnail_from_html(self, display_id, url):
-        webpage = self._download_webpage(url, display_id, fatal=False) or ''
-        return (
-            self._og_search_thumbnail(webpage, default=None)
-            or self._html_search_meta('thumbnailUrl', webpage, default=None))
+        page_data = call_api(0)
+        full_id = join_nonempty(playlist_id, season_number, version, delim='_')
+
+        return self.playlist_result(
+            OnDemandPagedList(fetch_page, self._PAGE_SIZE), full_id, display_id=display_id,
+            title=page_data.get('title'), description=page_data.get('synopsis'))
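
The rewritten collection extractor drops the hand-rolled GraphQL pagination loop above in favour of yt-dlp's OnDemandPagedList, which calls a page function lazily and only for the pages a consumer actually touches. A minimal sketch of that pattern, with FAKE_API and fetch_page as illustrative stand-ins for the real page-gateway endpoint:

from yt_dlp.utils import OnDemandPagedList

PAGE_SIZE = 100
FAKE_API = [f'item {i}' for i in range(237)]  # stand-in for a paged HTTP API

def fetch_page(page_num):
    # invoked lazily, once per requested 0-based page index
    start = page_num * PAGE_SIZE
    yield from FAKE_API[start:start + PAGE_SIZE]

entries = OnDemandPagedList(fetch_page, PAGE_SIZE)
print(entries[5])                  # only page 0 is fetched
print(entries.getslice(230, 237))  # only page 2 is fetched

Pages are cached by default, so a playlist built on such a list never re-downloads a page it has already seen.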

yt_dlp/extractor/atttechchannel.py

@@ -1,53 +0,0 @@
from .common import InfoExtractor
from ..utils import unified_strdate
class ATTTechChannelIE(InfoExtractor):
_VALID_URL = r'https?://techchannel\.att\.com/play-video\.cfm/([^/]+/)*(?P<id>.+)'
_TEST = {
'url': 'http://techchannel.att.com/play-video.cfm/2014/1/27/ATT-Archives-The-UNIX-System-Making-Computers-Easier-to-Use',
'info_dict': {
'id': '11316',
'display_id': 'ATT-Archives-The-UNIX-System-Making-Computers-Easier-to-Use',
'ext': 'flv',
'title': 'AT&T Archives : The UNIX System: Making Computers Easier to Use',
'description': 'A 1982 film about UNIX is the foundation for software in use around Bell Labs and AT&T.',
'thumbnail': r're:^https?://.*\.jpg$',
'upload_date': '20140127',
},
'params': {
# rtmp download
'skip_download': True,
},
}
def _real_extract(self, url):
display_id = self._match_id(url)
webpage = self._download_webpage(url, display_id)
video_url = self._search_regex(
r"url\s*:\s*'(rtmp://[^']+)'",
webpage, 'video URL')
video_id = self._search_regex(
r'mediaid\s*=\s*(\d+)',
webpage, 'video id', fatal=False)
title = self._og_search_title(webpage)
description = self._og_search_description(webpage)
thumbnail = self._og_search_thumbnail(webpage)
upload_date = unified_strdate(self._search_regex(
r'[Rr]elease\s+date:\s*(\d{1,2}/\d{1,2}/\d{4})',
webpage, 'upload date', fatal=False), False)
return {
'id': video_id,
'display_id': display_id,
'url': video_url,
'ext': 'flv',
'title': title,
'description': description,
'thumbnail': thumbnail,
'upload_date': upload_date,
}

yt_dlp/extractor/banbye.py

@@ -152,7 +152,7 @@ class BanByeChannelIE(BanByeBaseIE):
                 'sort': 'new',
                 'limit': self._PAGE_SIZE,
                 'offset': page_num * self._PAGE_SIZE,
-            }, note=f'Downloading page {page_num+1}')
+            }, note=f'Downloading page {page_num + 1}')
         return [
             self.url_result(f"{self._VIDEO_BASE}/{video['_id']}", BanByeIE)
             for video in data['items']

yt_dlp/extractor/bbc.py

@@ -317,16 +317,25 @@ class BBCCoUkIE(InfoExtractor):
     def _download_media_selector(self, programme_id):
         last_exception = None
+        formats, subtitles = [], {}
         for media_set in self._MEDIA_SETS:
             try:
-                return self._download_media_selector_url(
+                fmts, subs = self._download_media_selector_url(
                     self._MEDIA_SELECTOR_URL_TEMPL % (media_set, programme_id), programme_id)
+                formats.extend(fmts)
+                if subs:
+                    self._merge_subtitles(subs, target=subtitles)
             except BBCCoUkIE.MediaSelectionError as e:
                 if e.id in ('notukerror', 'geolocation', 'selectionunavailable'):
                     last_exception = e
                     continue
                 self._raise_extractor_error(e)
-        self._raise_extractor_error(last_exception)
+        if last_exception:
+            if formats or subtitles:
+                self.report_warning(f'{self.IE_NAME} returned error: {last_exception.id}')
+            else:
+                self._raise_extractor_error(last_exception)
+        return formats, subtitles

     def _download_media_selector_url(self, url, programme_id=None):
         media_selection = self._download_json(
@@ -1188,7 +1197,7 @@ class BBCIE(BBCCoUkIE): # XXX: Do not subclass from concrete IE
         if initial_data is None:
             initial_data = self._search_regex(
                 r'window\.__INITIAL_DATA__\s*=\s*({.+?})\s*;', webpage,
-                'preload state', default={})
+                'preload state', default='{}')
         else:
             initial_data = self._parse_json(initial_data or '"{}"', playlist_id, fatal=False)
         initial_data = self._parse_json(initial_data, playlist_id, fatal=False)
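
The _download_media_selector change above moves from "return the first media set that yields formats" to "merge formats and subtitles across all media sets, warn if only some of them failed, and hard-fail only when nothing was collected". A generic sketch of that accumulate-then-decide pattern; fetch, sources and RecoverableError are illustrative placeholders, not part of the BBC API:

class RecoverableError(Exception):
    pass

def collect(sources, fetch):
    # Merge results from every source; tolerate partial failure.
    results, last_exc = [], None
    for source in sources:
        try:
            results.extend(fetch(source))
        except RecoverableError as exc:
            last_exc = exc  # remember the error, keep trying the rest
    if last_exc and not results:
        raise last_exc  # every source failed: surface the error
    if last_exc:
        print(f'warning: a source failed: {last_exc}')  # partial success
    return results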

yt_dlp/extractor/behindkink.py

@@ -3,6 +3,7 @@ from ..utils import url_basename
 class BehindKinkIE(InfoExtractor):
+    _WORKING = False
     _VALID_URL = r'https?://(?:www\.)?behindkink\.com/(?P<year>[0-9]{4})/(?P<month>[0-9]{2})/(?P<day>[0-9]{2})/(?P<id>[^/#?_]+)'
     _TEST = {
         'url': 'http://www.behindkink.com/2014/12/05/what-are-you-passionate-about-marley-blaze/',

yt_dlp/extractor/bet.py

@@ -1,10 +1,9 @@
 from .mtv import MTVServicesInfoExtractor
 from ..utils import unified_strdate

-# TODO Remove - Reason: Outdated Site
 class BetIE(MTVServicesInfoExtractor):
+    _WORKING = False
     _VALID_URL = r'https?://(?:www\.)?bet\.com/(?:[^/]+/)+(?P<id>.+?)\.html'
     _TESTS = [
         {

yt_dlp/extractor/bfi.py

@@ -5,6 +5,7 @@ from ..utils import extract_attributes
 class BFIPlayerIE(InfoExtractor):
+    _WORKING = False
     IE_NAME = 'bfi:player'
     _VALID_URL = r'https?://player\.bfi\.org\.uk/[^/]+/film/watch-(?P<id>[\w-]+)-online'
     _TEST = {

yt_dlp/extractor/bfmtv.py

@@ -7,7 +7,7 @@ from ..utils import extract_attributes
 class BFMTVBaseIE(InfoExtractor):
     _VALID_URL_BASE = r'https?://(?:www\.|rmc\.)?bfmtv\.com/'
     _VALID_URL_TMPL = _VALID_URL_BASE + r'(?:[^/]+/)*[^/?&#]+_%s[A-Z]-(?P<id>\d{12})\.html'
-    _VIDEO_BLOCK_REGEX = r'(<div[^>]+class="video_block"[^>]*>)'
+    _VIDEO_BLOCK_REGEX = r'(<div[^>]+class="video_block[^"]*"[^>]*>)'
     BRIGHTCOVE_URL_TEMPLATE = 'http://players.brightcove.net/%s/%s_default/index.html?videoId=%s'

     def _brightcove_url_result(self, video_id, video_block):
@@ -55,8 +55,11 @@ class BFMTVLiveIE(BFMTVIE): # XXX: Do not subclass from concrete IE
             'ext': 'mp4',
             'title': r're:^le direct BFMTV WEB \d{4}-\d{2}-\d{2} \d{2}:\d{2}$',
             'uploader_id': '876450610001',
-            'upload_date': '20171018',
-            'timestamp': 1508329950,
+            'upload_date': '20220926',
+            'timestamp': 1664207191,
+            'live_status': 'is_live',
+            'thumbnail': r're:https://.+/image\.jpg',
+            'tags': [],
         },
         'params': {
             'skip_download': True,

yt_dlp/extractor/bilibili.py

@@ -2,6 +2,7 @@ import base64
 import functools
 import hashlib
 import itertools
+import json
 import math
 import re
 import time
@@ -16,9 +17,11 @@ from ..utils import (
     InAdvancePagedList,
     OnDemandPagedList,
     bool_or_none,
+    clean_html,
     filter_dict,
     float_or_none,
     format_field,
+    get_element_by_class,
     int_or_none,
     join_nonempty,
     make_archive_id,
@@ -88,6 +91,12 @@ class BilibiliBaseIE(InfoExtractor):
         return formats
def _download_playinfo(self, video_id, cid):
return self._download_json(
'https://api.bilibili.com/x/player/playurl', video_id,
query={'bvid': video_id, 'cid': cid, 'fnval': 4048},
note=f'Downloading video formats for cid {cid}')['data']
     def json2srt(self, json_data):
         srt_data = ''
         for idx, line in enumerate(json_data.get('body') or []):
@@ -96,7 +105,7 @@ class BilibiliBaseIE(InfoExtractor):
                          f'{line["content"]}\n\n')
         return srt_data

-    def _get_subtitles(self, video_id, aid, cid):
+    def _get_subtitles(self, video_id, cid, aid=None):
         subtitles = {
             'danmaku': [{
                 'ext': 'xml',
@@ -104,8 +113,15 @@ class BilibiliBaseIE(InfoExtractor):
             }]
         }

-        video_info_json = self._download_json(f'https://api.bilibili.com/x/player/v2?aid={aid}&cid={cid}', video_id)
-        for s in traverse_obj(video_info_json, ('data', 'subtitle', 'subtitles', ...)):
+        subtitle_info = traverse_obj(self._download_json(
+            'https://api.bilibili.com/x/player/v2', video_id,
+            query={'aid': aid, 'cid': cid} if aid else {'bvid': video_id, 'cid': cid},
+            note=f'Extracting subtitle info {cid}'), ('data', 'subtitle'))
+        subs_list = traverse_obj(subtitle_info, ('subtitles', lambda _, v: v['subtitle_url'] and v['lan']))
+        if not subs_list and traverse_obj(subtitle_info, 'allow_submit'):
+            if not self._get_cookies('https://api.bilibili.com').get('SESSDATA'):  # no login session cookie
+                self.report_warning(f'CC subtitles (if any) are only visible when logged in. {self._login_hint()}', only_once=True)
+        for s in subs_list:
             subtitles.setdefault(s['lan'], []).append({
                 'ext': 'srt',
                 'data': self.json2srt(self._download_json(s['subtitle_url'], video_id))
@@ -155,7 +171,54 @@ class BilibiliBaseIE(InfoExtractor):
         for entry in traverse_obj(season_info, (
                 'result', 'main_section', 'episodes',
                 lambda _, v: url_or_none(v['share_url']) and v['id'])):
-            yield self.url_result(entry['share_url'], BiliBiliBangumiIE, f'ep{entry["id"]}')
+            yield self.url_result(entry['share_url'], BiliBiliBangumiIE, str_or_none(entry.get('id')))
def _get_divisions(self, video_id, graph_version, edges, edge_id, cid_edges=None):
cid_edges = cid_edges or {}
division_data = self._download_json(
'https://api.bilibili.com/x/stein/edgeinfo_v2', video_id,
query={'graph_version': graph_version, 'edge_id': edge_id, 'bvid': video_id},
note=f'Extracting divisions from edge {edge_id}')
edges.setdefault(edge_id, {}).update(
traverse_obj(division_data, ('data', 'story_list', lambda _, v: v['edge_id'] == edge_id, {
'title': ('title', {str}),
'cid': ('cid', {int_or_none}),
}), get_all=False))
edges[edge_id].update(traverse_obj(division_data, ('data', {
'title': ('title', {str}),
'choices': ('edges', 'questions', ..., 'choices', ..., {
'edge_id': ('id', {int_or_none}),
'cid': ('cid', {int_or_none}),
'text': ('option', {str}),
}),
})))
# use dict to combine edges that use the same video section (same cid)
cid_edges.setdefault(edges[edge_id]['cid'], {})[edge_id] = edges[edge_id]
for choice in traverse_obj(edges, (edge_id, 'choices', ...)):
if choice['edge_id'] not in edges:
edges[choice['edge_id']] = {'cid': choice['cid']}
self._get_divisions(video_id, graph_version, edges, choice['edge_id'], cid_edges=cid_edges)
return cid_edges
def _get_interactive_entries(self, video_id, cid, metainfo):
graph_version = traverse_obj(
self._download_json(
'https://api.bilibili.com/x/player/wbi/v2', video_id,
'Extracting graph version', query={'bvid': video_id, 'cid': cid}),
('data', 'interaction', 'graph_version', {int_or_none}))
cid_edges = self._get_divisions(video_id, graph_version, {1: {'cid': cid}}, 1)
for cid, edges in cid_edges.items():
play_info = self._download_playinfo(video_id, cid)
yield {
**metainfo,
'id': f'{video_id}_{cid}',
'title': f'{metainfo.get("title")} - {list(edges.values())[0].get("title")}',
'formats': self.extract_formats(play_info),
'description': f'{json.dumps(edges, ensure_ascii=False)}\n{metainfo.get("description", "")}',
'duration': float_or_none(play_info.get('timelength'), scale=1000),
'subtitles': self.extract_subtitles(video_id, cid),
}
 class BiliBiliIE(BilibiliBaseIE):
@@ -180,7 +243,7 @@ class BiliBiliIE(BilibiliBaseIE):
             'view_count': int,
         },
     }, {
-        # old av URL version
+        'note': 'old av URL version',
         'url': 'http://www.bilibili.com/video/av1074402/',
         'info_dict': {
             'thumbnail': r're:^https?://.*\.(jpg|jpeg)$',
@@ -212,7 +275,7 @@ class BiliBiliIE(BilibiliBaseIE):
             'id': 'BV1bK411W797_p1',
             'ext': 'mp4',
             'title': '物语中的人物是如何吐槽自己的OP的 p01 Staple Stable/战场原+羽川',
-            'tags': 'count:11',
+            'tags': 'count:10',
             'timestamp': 1589601697,
             'thumbnail': r're:^https?://.*\.(jpg|jpeg|png)$',
             'uploader': '打牌还是打桩',
@@ -232,7 +295,7 @@ class BiliBiliIE(BilibiliBaseIE):
             'id': 'BV1bK411W797_p1',
             'ext': 'mp4',
             'title': '物语中的人物是如何吐槽自己的OP的 p01 Staple Stable/战场原+羽川',
-            'tags': 'count:11',
+            'tags': 'count:10',
             'timestamp': 1589601697,
             'thumbnail': r're:^https?://.*\.(jpg|jpeg|png)$',
             'uploader': '打牌还是打桩',
@@ -343,18 +406,120 @@ class BiliBiliIE(BilibiliBaseIE):
             'thumbnail': r're:^https?://.*\.(jpg|jpeg|png)$',
         },
         'params': {'skip_download': True},
}, {
'note': 'interactive/split-path video',
'url': 'https://www.bilibili.com/video/BV1af4y1H7ga/',
'info_dict': {
'id': 'BV1af4y1H7ga',
'title': '【互动游戏】花了大半年时间做的自我介绍~请查收!!',
'timestamp': 1630500414,
'upload_date': '20210901',
'description': 'md5:01113e39ab06e28042d74ac356a08786',
'tags': list,
'uploader': '钉宫妮妮Ninico',
'duration': 1503,
'uploader_id': '8881297',
'comment_count': int,
'view_count': int,
'like_count': int,
'thumbnail': r're:^https?://.*\.(jpg|jpeg|png)$',
},
'playlist_count': 33,
'playlist': [{
'info_dict': {
'id': 'BV1af4y1H7ga_400950101',
'ext': 'mp4',
'title': '【互动游戏】花了大半年时间做的自我介绍~请查收!! - 听见猫猫叫~',
'timestamp': 1630500414,
'upload_date': '20210901',
'description': 'md5:db66ac7a2813a94b8291dbce990cc5b2',
'tags': list,
'uploader': '钉宫妮妮Ninico',
'duration': 11.605,
'uploader_id': '8881297',
'comment_count': int,
'view_count': int,
'like_count': int,
'thumbnail': r're:^https?://.*\.(jpg|jpeg|png)$',
},
}],
}, {
'note': '301 redirect to bangumi link',
'url': 'https://www.bilibili.com/video/BV1TE411f7f1',
'info_dict': {
'id': '288525',
'title': '李永乐老师 钱学森弹道和乘波体飞行器是什么?',
'ext': 'mp4',
'series': '我和我的祖国',
'series_id': '4780',
'season': '幕后纪实',
'season_id': '28609',
'season_number': 1,
'episode': '钱学森弹道和乘波体飞行器是什么?',
'episode_id': '288525',
'episode_number': 105,
'duration': 1183.957,
'timestamp': 1571648124,
'upload_date': '20191021',
'thumbnail': r're:^https?://.*\.(jpg|jpeg|png)$',
},
}, {
'url': 'https://www.bilibili.com/video/BV1jL41167ZG/',
'info_dict': {
'id': 'BV1jL41167ZG',
'title': '一场大火引发的离奇死亡!古典推理经典短篇集《不可能犯罪诊断书》!',
'ext': 'mp4',
},
'skip': 'supporter-only video',
}, {
'url': 'https://www.bilibili.com/video/BV1Ks411f7aQ/',
'info_dict': {
'id': 'BV1Ks411f7aQ',
'title': '【BD1080P】狼与香辛料I【华盟】',
'ext': 'mp4',
},
'skip': 'login required',
}, {
'url': 'https://www.bilibili.com/video/BV1GJ411x7h7/',
'info_dict': {
'id': 'BV1GJ411x7h7',
'title': '【官方 MV】Never Gonna Give You Up - Rick Astley',
'ext': 'mp4',
},
'skip': 'geo-restricted',
}] }]
     def _real_extract(self, url):
         video_id = self._match_id(url)
-        webpage = self._download_webpage(url, video_id)
+        webpage, urlh = self._download_webpage_handle(url, video_id)
+        if not self._match_valid_url(urlh.url):
+            return self.url_result(urlh.url)
+
         initial_state = self._search_json(r'window\.__INITIAL_STATE__\s*=', webpage, 'initial state', video_id)

         is_festival = 'videoData' not in initial_state
         if is_festival:
             video_data = initial_state['videoInfo']
         else:
-            play_info = self._search_json(r'window\.__playinfo__\s*=', webpage, 'play info', video_id)['data']
+            play_info_obj = self._search_json(
+                r'window\.__playinfo__\s*=', webpage, 'play info', video_id, fatal=False)
if not play_info_obj:
if traverse_obj(initial_state, ('error', 'trueCode')) == -403:
self.raise_login_required()
if traverse_obj(initial_state, ('error', 'trueCode')) == -404:
raise ExtractorError(
'This video may be deleted or geo-restricted. '
'You might want to try a VPN or a proxy server (with --proxy)', expected=True)
play_info = traverse_obj(play_info_obj, ('data', {dict}))
if not play_info:
if traverse_obj(play_info_obj, 'code') == 87007:
toast = get_element_by_class('tips-toast', webpage) or ''
msg = clean_html(
f'{get_element_by_class("belongs-to", toast) or ""}'
+ (get_element_by_class('level', toast) or ''))
raise ExtractorError(
f'This is a supporter-only video: {msg}. {self._login_hint()}', expected=True)
raise ExtractorError('Failed to extract play info')
         video_data = initial_state['videoData']

         video_id, title = video_data['bvid'], video_data.get('title')
@@ -385,10 +550,7 @@ class BiliBiliIE(BilibiliBaseIE):
         festival_info = {}
         if is_festival:
-            play_info = self._download_json(
-                'https://api.bilibili.com/x/player/playurl', video_id,
-                query={'bvid': video_id, 'cid': cid, 'fnval': 4048},
-                note='Extracting festival video formats')['data']
+            play_info = self._download_playinfo(video_id, cid)

             festival_info = traverse_obj(initial_state, {
                 'uploader': ('videoInfo', 'upName'),
@@ -397,7 +559,7 @@ class BiliBiliIE(BilibiliBaseIE):
                 'thumbnail': ('sectionEpisodes', lambda _, v: v['bvid'] == video_id, 'cover'),
             }, get_all=False)

-        return {
+        metainfo = {
             **traverse_obj(initial_state, {
                 'uploader': ('upData', 'name'),
                 'uploader_id': ('upData', 'mid', {str_or_none}),
@@ -413,28 +575,59 @@ class BiliBiliIE(BilibiliBaseIE):
                 'comment_count': ('stat', 'reply', {int_or_none}),
             }, get_all=False),
             'id': f'{video_id}{format_field(part_id, None, "_p%d")}',
-            'formats': self.extract_formats(play_info),
             '_old_archive_ids': [make_archive_id(self, old_video_id)] if old_video_id else None,
             'title': title,
-            'duration': float_or_none(play_info.get('timelength'), scale=1000),
-            'chapters': self._get_chapters(aid, cid),
-            'subtitles': self.extract_subtitles(video_id, aid, cid),
-            '__post_extractor': self.extract_comments(aid),
             'http_headers': {'Referer': url},
         }
is_interactive = traverse_obj(video_data, ('rights', 'is_stein_gate'))
if is_interactive:
return self.playlist_result(
self._get_interactive_entries(video_id, cid, metainfo), **metainfo, **{
'duration': traverse_obj(initial_state, ('videoData', 'duration', {int_or_none})),
'__post_extractor': self.extract_comments(aid),
})
else:
return {
**metainfo,
'duration': float_or_none(play_info.get('timelength'), scale=1000),
'chapters': self._get_chapters(aid, cid),
'subtitles': self.extract_subtitles(video_id, cid),
'formats': self.extract_formats(play_info),
'__post_extractor': self.extract_comments(aid),
}
 class BiliBiliBangumiIE(BilibiliBaseIE):
-    _VALID_URL = r'https?://(?:www\.)?bilibili\.com/bangumi/play/(?P<id>ep\d+)'
+    _VALID_URL = r'https?://(?:www\.)?bilibili\.com/bangumi/play/ep(?P<id>\d+)'
     _TESTS = [{
'url': 'https://www.bilibili.com/bangumi/play/ep21495/',
'info_dict': {
'id': '21495',
'ext': 'mp4',
'series': '悠久之翼',
'series_id': '774',
'season': '第二季',
'season_id': '1182',
'season_number': 2,
'episode': 'foreveref',
'episode_id': '21495',
'episode_number': 12,
'title': '12 foreveref',
'duration': 1420.791,
'timestamp': 1320412200,
'upload_date': '20111104',
'thumbnail': r're:^https?://.*\.(jpg|jpeg|png)$',
},
}, {
         'url': 'https://www.bilibili.com/bangumi/play/ep267851',
         'info_dict': {
             'id': '267851',
             'ext': 'mp4',
             'series': '鬼灭之刃',
             'series_id': '4358',
-            'season': '鬼灭之刃',
+            'season': '立志篇',
             'season_id': '26801',
             'season_number': 1,
             'episode': '残酷',
@@ -446,13 +639,32 @@ class BiliBiliBangumiIE(BilibiliBaseIE):
             'upload_date': '20190406',
             'thumbnail': r're:^https?://.*\.(jpg|jpeg|png)$'
         },
-        'skip': 'According to the copyright owner\'s request, you may only watch the video after you are premium member.'
+        'skip': 'Geo-restricted',
}, {
'note': 'a making-of which falls outside main section',
'url': 'https://www.bilibili.com/bangumi/play/ep345120',
'info_dict': {
'id': '345120',
'ext': 'mp4',
'series': '鬼灭之刃',
'series_id': '4358',
'season': '立志篇',
'season_id': '26801',
'season_number': 1,
'episode': '炭治郎篇',
'episode_id': '345120',
'episode_number': 27,
'title': '#1 炭治郎篇',
'duration': 1922.129,
'timestamp': 1602853860,
'upload_date': '20201016',
'thumbnail': r're:^https?://.*\.(jpg|jpeg|png)$'
},
     }]

     def _real_extract(self, url):
-        video_id = self._match_id(url)
-        episode_id = video_id[2:]
-        webpage = self._download_webpage(url, video_id)
+        episode_id = self._match_id(url)
+        webpage = self._download_webpage(url, episode_id)

         if '您所在的地区无法观看本片' in webpage:
             raise GeoRestrictedError('This video is restricted')
@@ -461,7 +673,7 @@ class BiliBiliBangumiIE(BilibiliBaseIE):
         headers = {'Referer': url, **self.geo_verification_headers()}
         play_info = self._download_json(
-            'https://api.bilibili.com/pgc/player/web/v2/playurl', video_id,
+            'https://api.bilibili.com/pgc/player/web/v2/playurl', episode_id,
             'Extracting episode', query={'fnval': '4048', 'ep_id': episode_id},
             headers=headers)
         premium_only = play_info.get('code') == -10403
@@ -472,40 +684,43 @@ class BiliBiliBangumiIE(BilibiliBaseIE):
             self.raise_login_required('This video is for premium members only')

         bangumi_info = self._download_json(
-            'https://api.bilibili.com/pgc/view/web/season', video_id, 'Get episode details',
+            'https://api.bilibili.com/pgc/view/web/season', episode_id, 'Get episode details',
             query={'ep_id': episode_id}, headers=headers)['result']

         episode_number, episode_info = next((
             (idx, ep) for idx, ep in enumerate(traverse_obj(
-                bangumi_info, ('episodes', ..., {dict})), 1)
+                bangumi_info, (('episodes', ('section', ..., 'episodes')), ..., {dict})), 1)
             if str_or_none(ep.get('id')) == episode_id), (1, {}))

         season_id = bangumi_info.get('season_id')
-        season_number = season_id and next((
-            idx + 1 for idx, e in enumerate(
+        season_number, season_title = season_id and next((
+            (idx + 1, e.get('season_title')) for idx, e in enumerate(
                 traverse_obj(bangumi_info, ('seasons', ...)))
             if e.get('season_id') == season_id
-        ), None)
+        ), (None, None))

         aid = episode_info.get('aid')

         return {
-            'id': video_id,
+            'id': episode_id,
             'formats': formats,
             **traverse_obj(bangumi_info, {
                 'series': ('series', 'series_title', {str}),
                 'series_id': ('series', 'series_id', {str_or_none}),
                 'thumbnail': ('square_cover', {url_or_none}),
             }),
-            'title': join_nonempty('title', 'long_title', delim=' ', from_dict=episode_info),
-            'episode': episode_info.get('long_title'),
+            **traverse_obj(episode_info, {
+                'episode': ('long_title', {str}),
+                'episode_number': ('title', {int_or_none}, {lambda x: x or episode_number}),
+                'timestamp': ('pub_time', {int_or_none}),
+                'title': {lambda v: v and join_nonempty('title', 'long_title', delim=' ', from_dict=v)},
+            }),
             'episode_id': episode_id,
-            'episode_number': int_or_none(episode_info.get('title')) or episode_number,
+            'season': str_or_none(season_title),
             'season_id': str_or_none(season_id),
             'season_number': season_number,
-            'timestamp': int_or_none(episode_info.get('pub_time')),
             'duration': float_or_none(play_info.get('timelength'), scale=1000),
-            'subtitles': self.extract_subtitles(video_id, aid, episode_info.get('cid')),
+            'subtitles': self.extract_subtitles(episode_id, episode_info.get('cid'), aid=aid),
             '__post_extractor': self.extract_comments(aid),
             'http_headers': headers,
         }
@@ -517,17 +732,53 @@ class BiliBiliBangumiMediaIE(BilibiliBaseIE):
         'url': 'https://www.bilibili.com/bangumi/media/md24097891',
         'info_dict': {
             'id': '24097891',
+            'title': 'CAROLE & TUESDAY',
+            'description': 'md5:42417ad33d1eaa1c93bfd2dd1626b829',
         },
         'playlist_mincount': 25,
}, {
'url': 'https://www.bilibili.com/bangumi/media/md1565/',
'info_dict': {
'id': '1565',
'title': '攻壳机动队 S.A.C. 2nd GIG',
'description': 'md5:46cac00bafd645b97f4d6df616fc576d',
},
'playlist_count': 26,
'playlist': [{
'info_dict': {
'id': '68540',
'ext': 'mp4',
'series': '攻壳机动队',
'series_id': '1077',
'season': '第二季',
'season_id': '1565',
'season_number': 2,
'episode': '再启动 REEMBODY',
'episode_id': '68540',
'episode_number': 1,
'title': '1 再启动 REEMBODY',
'duration': 1525.777,
'timestamp': 1425074413,
'upload_date': '20150227',
'thumbnail': r're:^https?://.*\.(jpg|jpeg|png)$'
},
}],
     }]

     def _real_extract(self, url):
         media_id = self._match_id(url)
         webpage = self._download_webpage(url, media_id)
-        ss_id = self._search_json(
-            r'window\.__INITIAL_STATE__\s*=', webpage, 'initial_state', media_id)['mediaInfo']['season_id']

-        return self.playlist_result(self._get_episodes_from_season(ss_id, url), media_id)
+        initial_state = self._search_json(
+            r'window\.__INITIAL_STATE__\s*=', webpage, 'initial_state', media_id)
+        ss_id = initial_state['mediaInfo']['season_id']
+
+        return self.playlist_result(
+            self._get_episodes_from_season(ss_id, url), media_id,
+            **traverse_obj(initial_state, ('mediaInfo', {
+                'title': ('title', {str}),
+                'description': ('evaluate', {str}),
+            })))
 class BiliBiliBangumiSeasonIE(BilibiliBaseIE):
@@ -535,15 +786,183 @@ class BiliBiliBangumiSeasonIE(BilibiliBaseIE):
     _TESTS = [{
         'url': 'https://www.bilibili.com/bangumi/play/ss26801',
         'info_dict': {
-            'id': '26801'
+            'id': '26801',
+            'title': '鬼灭之刃',
+            'description': 'md5:e2cc9848b6f69be6db79fc2a82d9661b',
         },
         'playlist_mincount': 26
}, {
'url': 'https://www.bilibili.com/bangumi/play/ss2251',
'info_dict': {
'id': '2251',
'title': '玲音',
'description': 'md5:1fd40e3df4c08d4d9d89a6a34844bdc4',
},
'playlist_count': 13,
'playlist': [{
'info_dict': {
'id': '50188',
'ext': 'mp4',
'series': '玲音',
'series_id': '1526',
'season': 'TV',
'season_id': '2251',
'season_number': 1,
'episode': 'WEIRD',
'episode_id': '50188',
'episode_number': 1,
'title': '1 WEIRD',
'duration': 1436.992,
'timestamp': 1343185080,
'upload_date': '20120725',
'thumbnail': r're:^https?://.*\.(jpg|jpeg|png)$'
},
}],
     }]

     def _real_extract(self, url):
         ss_id = self._match_id(url)
+        webpage = self._download_webpage(url, ss_id)
+        metainfo = traverse_obj(
+            self._search_json(r'<script[^>]+type="application/ld\+json"[^>]*>', webpage, 'info', ss_id),
+            ('itemListElement', ..., {
+                'title': ('name', {str}),
+                'description': ('description', {str}),
+            }), get_all=False)

-        return self.playlist_result(self._get_episodes_from_season(ss_id, url), ss_id)
+        return self.playlist_result(self._get_episodes_from_season(ss_id, url), ss_id, **metainfo)
class BilibiliCheeseBaseIE(BilibiliBaseIE):
_HEADERS = {'Referer': 'https://www.bilibili.com/'}
def _extract_episode(self, season_info, ep_id):
episode_info = traverse_obj(season_info, (
'episodes', lambda _, v: v['id'] == int(ep_id)), get_all=False)
aid, cid = episode_info['aid'], episode_info['cid']
if traverse_obj(episode_info, 'ep_status') == -1:
raise ExtractorError('This course episode is not yet available.', expected=True)
if not traverse_obj(episode_info, 'playable'):
self.raise_login_required('You need to purchase the course to download this episode')
play_info = self._download_json(
'https://api.bilibili.com/pugv/player/web/playurl', ep_id,
query={'avid': aid, 'cid': cid, 'ep_id': ep_id, 'fnval': 16, 'fourk': 1},
headers=self._HEADERS, note='Downloading playinfo')['data']
return {
'id': str_or_none(ep_id),
'episode_id': str_or_none(ep_id),
'formats': self.extract_formats(play_info),
'extractor_key': BilibiliCheeseIE.ie_key(),
'extractor': BilibiliCheeseIE.IE_NAME,
'webpage_url': f'https://www.bilibili.com/cheese/play/ep{ep_id}',
**traverse_obj(episode_info, {
'episode': ('title', {str}),
'title': {lambda v: v and join_nonempty('index', 'title', delim=' - ', from_dict=v)},
'alt_title': ('subtitle', {str}),
'duration': ('duration', {int_or_none}),
'episode_number': ('index', {int_or_none}),
'thumbnail': ('cover', {url_or_none}),
'timestamp': ('release_date', {int_or_none}),
'view_count': ('play', {int_or_none}),
}),
**traverse_obj(season_info, {
'uploader': ('up_info', 'uname', {str}),
'uploader_id': ('up_info', 'mid', {str_or_none}),
}),
'subtitles': self.extract_subtitles(ep_id, cid, aid=aid),
'__post_extractor': self.extract_comments(aid),
'http_headers': self._HEADERS,
}
def _download_season_info(self, query_key, video_id):
return self._download_json(
f'https://api.bilibili.com/pugv/view/web/season?{query_key}={video_id}', video_id,
headers=self._HEADERS, note='Downloading season info')['data']
class BilibiliCheeseIE(BilibiliCheeseBaseIE):
_VALID_URL = r'https?://(?:www\.)?bilibili\.com/cheese/play/ep(?P<id>\d+)'
_TESTS = [{
'url': 'https://www.bilibili.com/cheese/play/ep229832',
'info_dict': {
'id': '229832',
'ext': 'mp4',
'title': '1 - 课程先导片',
'alt_title': '视频课·3分41秒',
'uploader': '马督工',
'uploader_id': '316568752',
'episode': '课程先导片',
'episode_id': '229832',
'episode_number': 1,
'duration': 221,
'timestamp': 1695549606,
'upload_date': '20230924',
'thumbnail': r're:^https?://.*\.(jpg|jpeg|png)$',
'view_count': int,
}
}]
def _real_extract(self, url):
ep_id = self._match_id(url)
return self._extract_episode(self._download_season_info('ep_id', ep_id), ep_id)
class BilibiliCheeseSeasonIE(BilibiliCheeseBaseIE):
_VALID_URL = r'https?://(?:www\.)?bilibili\.com/cheese/play/ss(?P<id>\d+)'
_TESTS = [{
'url': 'https://www.bilibili.com/cheese/play/ss5918',
'info_dict': {
'id': '5918',
'title': '【限时五折】新闻系学不到:马督工教你做自媒体',
'description': '帮普通人建立世界模型,降低人与人的沟通门槛',
},
'playlist': [{
'info_dict': {
'id': '229832',
'ext': 'mp4',
'title': '1 - 课程先导片',
'alt_title': '视频课·3分41秒',
'uploader': '马督工',
'uploader_id': '316568752',
'episode': '课程先导片',
'episode_id': '229832',
'episode_number': 1,
'duration': 221,
'timestamp': 1695549606,
'upload_date': '20230924',
'thumbnail': r're:^https?://.*\.(jpg|jpeg|png)$',
'view_count': int,
}
}],
'params': {'playlist_items': '1'},
}, {
'url': 'https://www.bilibili.com/cheese/play/ss5918',
'info_dict': {
'id': '5918',
'title': '【限时五折】新闻系学不到:马督工教你做自媒体',
'description': '帮普通人建立世界模型,降低人与人的沟通门槛',
},
'playlist_mincount': 5,
'skip': 'paid video in list',
}]
def _get_cheese_entries(self, season_info):
for ep_id in traverse_obj(season_info, ('episodes', lambda _, v: v['episode_can_view'], 'id')):
yield self._extract_episode(season_info, ep_id)
def _real_extract(self, url):
season_id = self._match_id(url)
season_info = self._download_season_info('season_id', season_id)
return self.playlist_result(
self._get_cheese_entries(season_info), season_id,
**traverse_obj(season_info, {
'title': ('title', {str}),
'description': ('subtitle', {str}),
}))
 class BilibiliSpaceBaseIE(InfoExtractor):
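
As a side note on the subtitle handling in this file: Bilibili serves CC subtitles as JSON cue lists, which json2srt above flattens into SRT text. A rough standalone equivalent of that conversion (the cues sample is illustrative; real payloads come from each subtitle_url):

def json2srt(body):
    def timecode(seconds):
        ms = int(round(seconds * 1000))
        hours, ms = divmod(ms, 3_600_000)
        minutes, ms = divmod(ms, 60_000)
        secs, ms = divmod(ms, 1_000)
        return f'{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}'

    return '\n'.join(
        f'{idx}\n{timecode(cue["from"])} --> {timecode(cue["to"])}\n{cue["content"]}\n'
        for idx, cue in enumerate(body, 1))

cues = [{'from': 0.0, 'to': 2.5, 'content': 'hello'},
        {'from': 2.5, 'to': 4.0, 'content': 'world'}]
print(json2srt(cues))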

yt_dlp/extractor/biqle.py

@@ -1,110 +0,0 @@
from .common import InfoExtractor
from .vk import VKIE
from ..compat import compat_b64decode
from ..utils import (
int_or_none,
js_to_json,
traverse_obj,
unified_timestamp,
)
class BIQLEIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?biqle\.(?:com|org|ru)/watch/(?P<id>-?\d+_\d+)'
_TESTS = [{
'url': 'https://biqle.ru/watch/-2000421746_85421746',
'md5': 'ae6ef4f04d19ac84e4658046d02c151c',
'info_dict': {
'id': '-2000421746_85421746',
'ext': 'mp4',
'title': 'Forsaken By Hope Studio Clip',
'description': 'Forsaken By Hope Studio Clip — Смотреть онлайн',
'upload_date': '19700101',
'thumbnail': r're:https://[^/]+/impf/7vN3ACwSTgChP96OdOfzFjUCzFR6ZglDQgWsIw/KPaACiVJJxM\.jpg\?size=800x450&quality=96&keep_aspect_ratio=1&background=000000&sign=b48ea459c4d33dbcba5e26d63574b1cb&type=video_thumb',
'timestamp': 0,
},
}, {
'url': 'http://biqle.org/watch/-44781847_168547604',
'md5': '7f24e72af1db0edf7c1aaba513174f97',
'info_dict': {
'id': '-44781847_168547604',
'ext': 'mp4',
'title': 'Ребенок в шоке от автоматической мойки',
'description': 'Ребенок в шоке от автоматической мойки — Смотреть онлайн',
'timestamp': 1396633454,
'upload_date': '20140404',
'thumbnail': r're:https://[^/]+/c535507/u190034692/video/l_b84df002\.jpg',
},
}]
def _real_extract(self, url):
video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id)
title = self._html_search_meta('name', webpage, 'Title', fatal=False)
timestamp = unified_timestamp(self._html_search_meta('uploadDate', webpage, 'Upload Date', default=None))
description = self._html_search_meta('description', webpage, 'Description', default=None)
global_embed_url = self._search_regex(
r'<script[^<]+?window.globEmbedUrl\s*=\s*\'((?:https?:)?//(?:daxab\.com|dxb\.to|[^/]+/player)/[^\']+)\'',
webpage, 'global Embed url')
hash = self._search_regex(
r'<script id="data-embed-video[^<]+?hash: "([^"]+)"[^<]*</script>', webpage, 'Hash')
embed_url = global_embed_url + hash
if VKIE.suitable(embed_url):
return self.url_result(embed_url, VKIE.ie_key(), video_id)
embed_page = self._download_webpage(
embed_url, video_id, 'Downloading embed webpage', headers={'Referer': url})
glob_params = self._parse_json(self._search_regex(
r'<script id="globParams">[^<]*window.globParams = ([^;]+);[^<]+</script>',
embed_page, 'Global Parameters'), video_id, transform_source=js_to_json)
host_name = compat_b64decode(glob_params['server'][::-1]).decode()
item = self._download_json(
f'https://{host_name}/method/video.get/{video_id}', video_id,
headers={'Referer': url}, query={
'token': glob_params['video']['access_token'],
'videos': video_id,
'ckey': glob_params['c_key'],
'credentials': glob_params['video']['credentials'],
})['response']['items'][0]
formats = []
for f_id, f_url in item.get('files', {}).items():
if f_id == 'external':
return self.url_result(f_url)
ext, height = f_id.split('_')
height_extra_key = traverse_obj(glob_params, ('video', 'partial', 'quality', height))
if height_extra_key:
formats.append({
'format_id': f'{height}p',
'url': f'https://{host_name}/{f_url[8:]}&videos={video_id}&extra_key={height_extra_key}',
'height': int_or_none(height),
'ext': ext,
})
thumbnails = []
for k, v in item.items():
if k.startswith('photo_') and v:
width = k.replace('photo_', '')
thumbnails.append({
'id': width,
'url': v,
'width': int_or_none(width),
})
return {
'id': video_id,
'title': title,
'formats': formats,
'comment_count': int_or_none(item.get('comments')),
'description': description,
'duration': int_or_none(item.get('duration')),
'thumbnails': thumbnails,
'timestamp': timestamp,
'view_count': int_or_none(item.get('views')),
}

yt_dlp/extractor/bitchute.py

@@ -7,8 +7,10 @@ from ..utils import (
     ExtractorError,
     OnDemandPagedList,
     clean_html,
+    extract_attributes,
     get_element_by_class,
     get_element_by_id,
+    get_element_html_by_class,
     get_elements_html_by_class,
     int_or_none,
     orderedSet,
@@ -17,6 +19,7 @@ from ..utils import (
     traverse_obj,
     unified_strdate,
     urlencode_postdata,
+    urljoin,
 )
@@ -34,6 +37,25 @@ class BitChuteIE(InfoExtractor):
             'thumbnail': r're:^https?://.*\.jpg$',
             'uploader': 'BitChute',
             'upload_date': '20170103',
+            'uploader_url': 'https://www.bitchute.com/profile/I5NgtHZn9vPj/',
+            'channel': 'BitChute',
+            'channel_url': 'https://www.bitchute.com/channel/bitchute/'
+        },
+    }, {
+        # test case: video with different channel and uploader
+        'url': 'https://www.bitchute.com/video/Yti_j9A-UZ4/',
+        'md5': 'f10e6a8e787766235946d0868703f1d0',
+        'info_dict': {
+            'id': 'Yti_j9A-UZ4',
+            'ext': 'mp4',
+            'title': 'Israel at War | Full Measure',
+            'description': 'md5:38cf7bc6f42da1a877835539111c69ef',
+            'thumbnail': r're:^https?://.*\.jpg$',
+            'uploader': 'sharylattkisson',
+            'upload_date': '20231106',
+            'uploader_url': 'https://www.bitchute.com/profile/9K0kUWA9zmd9/',
+            'channel': 'Full Measure with Sharyl Attkisson',
+            'channel_url': 'https://www.bitchute.com/channel/sharylattkisson/'
         },
     }, {
         # video not downloadable in browser, but we can recover it
@@ -48,6 +70,9 @@ class BitChuteIE(InfoExtractor):
             'thumbnail': r're:^https?://.*\.jpg$',
             'uploader': 'BitChute',
             'upload_date': '20181113',
+            'uploader_url': 'https://www.bitchute.com/profile/I5NgtHZn9vPj/',
+            'channel': 'BitChute',
+            'channel_url': 'https://www.bitchute.com/channel/bitchute/'
         },
         'params': {'check_formats': None},
     }, {
@@ -99,6 +124,11 @@ class BitChuteIE(InfoExtractor):
             reason = clean_html(get_element_by_id('page-detail', webpage)) or page_title
             self.raise_geo_restricted(reason)

+    @staticmethod
+    def _make_url(html):
+        path = extract_attributes(get_element_html_by_class('spa', html) or '').get('href')
+        return urljoin('https://www.bitchute.com', path)
+
     def _real_extract(self, url):
         video_id = self._match_id(url)
         webpage = self._download_webpage(
@@ -121,12 +151,19 @@ class BitChuteIE(InfoExtractor):
                 'Video is unavailable. Please make sure this video is playable in the browser '
                 'before reporting this issue.', expected=True, video_id=video_id)

+        details = get_element_by_class('details', webpage) or ''
+        uploader_html = get_element_html_by_class('creator', details) or ''
+        channel_html = get_element_html_by_class('name', details) or ''
+
         return {
             'id': video_id,
             'title': self._html_extract_title(webpage) or self._og_search_title(webpage),
             'description': self._og_search_description(webpage, default=None),
             'thumbnail': self._og_search_thumbnail(webpage),
-            'uploader': clean_html(get_element_by_class('owner', webpage)),
+            'uploader': clean_html(uploader_html),
+            'uploader_url': self._make_url(uploader_html),
+            'channel': clean_html(channel_html),
+            'channel_url': self._make_url(channel_html),
             'upload_date': unified_strdate(self._search_regex(
                 r'at \d+:\d+ UTC on (.+?)\.', publish_date, 'upload date', fatal=False)),
             'formats': formats,
@@ -154,6 +191,9 @@ class BitChuteChannelIE(InfoExtractor):
                 'thumbnail': r're:^https?://.*\.jpg$',
                 'uploader': 'BitChute',
                 'upload_date': '20170103',
+                'uploader_url': 'https://www.bitchute.com/profile/I5NgtHZn9vPj/',
+                'channel': 'BitChute',
+                'channel_url': 'https://www.bitchute.com/channel/bitchute/',
                 'duration': 16,
                 'view_count': int,
             },
@@ -169,7 +209,7 @@ class BitChuteChannelIE(InfoExtractor):
         'info_dict': {
             'id': 'wV9Imujxasw9',
             'title': 'Bruce MacDonald and "The Light of Darkness"',
-            'description': 'md5:04913227d2714af1d36d804aa2ab6b1e',
+            'description': 'md5:747724ef404eebdfc04277714f81863e',
         }
     }]
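Note: a quick sketch of what the new `_make_url` helper resolves, using the same yt-dlp utilities it imports (the HTML snippet is invented for illustration):

from yt_dlp.utils import extract_attributes, get_element_html_by_class, urljoin

html = '<a class="spa" href="/channel/bitchute/">BitChute</a>'  # illustrative markup
path = extract_attributes(get_element_html_by_class('spa', html) or '').get('href')
print(urljoin('https://www.bitchute.com', path))  # -> https://www.bitchute.com/channel/bitchute/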


@@ -1,58 +0,0 @@
from .common import InfoExtractor
class BitwaveReplayIE(InfoExtractor):
IE_NAME = 'bitwave:replay'
_VALID_URL = r'https?://(?:www\.)?bitwave\.tv/(?P<user>\w+)/replay/(?P<id>\w+)/?$'
_TEST = {
'url': 'https://bitwave.tv/RhythmicCarnage/replay/z4P6eq5L7WDrM85UCrVr',
'only_matching': True
}
def _real_extract(self, url):
replay_id = self._match_id(url)
replay = self._download_json(
'https://api.bitwave.tv/v1/replays/' + replay_id,
replay_id
)
return {
'id': replay_id,
'title': replay['data']['title'],
'uploader': replay['data']['name'],
'uploader_id': replay['data']['name'],
'url': replay['data']['url'],
'thumbnails': [
{'url': x} for x in replay['data']['thumbnails']
],
}
class BitwaveStreamIE(InfoExtractor):
IE_NAME = 'bitwave:stream'
_VALID_URL = r'https?://(?:www\.)?bitwave\.tv/(?P<id>\w+)/?$'
_TEST = {
'url': 'https://bitwave.tv/doomtube',
'only_matching': True
}
def _real_extract(self, url):
username = self._match_id(url)
channel = self._download_json(
'https://api.bitwave.tv/v1/channels/' + username,
username)
formats = self._extract_m3u8_formats(
channel['data']['url'], username,
'mp4')
return {
'id': username,
'title': channel['data']['title'],
'uploader': username,
'uploader_id': username,
'formats': formats,
'thumbnail': channel['data']['thumbnail'],
'is_live': True,
'view_count': channel['data']['viewCount']
}


@@ -22,7 +22,7 @@ class BleacherReportIE(InfoExtractor):
             'upload_date': '20150615',
             'uploader': 'Team Stream Now ',
         },
-        'add_ie': ['Ooyala'],
+        'skip': 'Video removed',
     }, {
         'url': 'http://bleacherreport.com/articles/2586817-aussie-golfers-get-fright-of-their-lives-after-being-chased-by-angry-kangaroo',
         'md5': '6a5cd403418c7b01719248ca97fb0692',
@@ -70,8 +70,6 @@ class BleacherReportIE(InfoExtractor):
         video_type = video['type']
         if video_type in ('cms.bleacherreport.com', 'vid.bleacherreport.com'):
             info['url'] = 'http://bleacherreport.com/video_embed?id=%s' % video['id']
-        elif video_type == 'ooyala.com':
-            info['url'] = 'ooyala:%s' % video['id']
         elif video_type == 'youtube.com':
             info['url'] = video['id']
         elif video_type == 'vine.co':


@@ -1,16 +1,17 @@
 import json
+import urllib.parse

 from .common import InfoExtractor
 from ..utils import (
-    determine_ext,
     parse_iso8601,
-    # try_get,
     update_url_query,
+    url_or_none,
 )
+from ..utils.traversal import traverse_obj


 class BoxIE(InfoExtractor):
-    _VALID_URL = r'https?://(?:[^.]+\.)?app\.box\.com/s/(?P<shared_name>[^/]+)/file/(?P<id>\d+)'
+    _VALID_URL = r'https?://(?:[^.]+\.)?app\.box\.com/s/(?P<shared_name>[^/?#]+)/file/(?P<id>\d+)'
     _TEST = {
         'url': 'https://mlssoccer.app.box.com/s/0evd2o3e08l60lr4ygukepvnkord1o1x/file/510727257538',
         'md5': '1f81b2fd3960f38a40a3b8823e5fcd43',
@@ -18,11 +19,12 @@ class BoxIE(InfoExtractor):
             'id': '510727257538',
             'ext': 'mp4',
             'title': 'Garber St. Louis will be 28th MLS team +scarving.mp4',
-            'uploader': 'MLS Video',
+            'uploader': '',
             'timestamp': 1566320259,
             'upload_date': '20190820',
             'uploader_id': '235196876',
-        }
+        },
+        'params': {'skip_download': 'dash fragment too small'},
     }

     def _real_extract(self, url):
@@ -58,26 +60,15 @@ class BoxIE(InfoExtractor):
         formats = []

-        # for entry in (try_get(f, lambda x: x['representations']['entries'], list) or []):
-        #     entry_url_template = try_get(
-        #         entry, lambda x: x['content']['url_template'])
-        #     if not entry_url_template:
-        #         continue
-        #     representation = entry.get('representation')
-        #     if representation == 'dash':
-        #         TODO: append query to every fragment URL
-        #         formats.extend(self._extract_mpd_formats(
-        #             entry_url_template.replace('{+asset_path}', 'manifest.mpd'),
-        #             file_id, query=query))
-
-        authenticated_download_url = f.get('authenticated_download_url')
-        if authenticated_download_url and f.get('is_download_available'):
-            formats.append({
-                'ext': f.get('extension') or determine_ext(title),
-                'filesize': f.get('size'),
-                'format_id': 'download',
-                'url': update_url_query(authenticated_download_url, query),
-            })
+        for url_tmpl in traverse_obj(f, (
+            'representations', 'entries', lambda _, v: v['representation'] == 'dash',
+            'content', 'url_template', {url_or_none}
+        )):
+            manifest_url = update_url_query(url_tmpl.replace('{+asset_path}', 'manifest.mpd'), query)
+            fmts = self._extract_mpd_formats(manifest_url, file_id)
+            for fmt in fmts:
+                fmt['extra_param_to_segment_url'] = urllib.parse.urlparse(manifest_url).query
+            formats.extend(fmts)

         creator = f.get('created_by') or {}
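Note: the rewritten loop feeds the manifest's query string to the fragment downloader through `extra_param_to_segment_url`, so each DASH segment request carries the same credentials as the manifest request. Roughly (token and URLs hypothetical):

import urllib.parse

manifest_url = 'https://dl.boxcloud.com/manifest.mpd?access_token=abc123'  # hypothetical
query = urllib.parse.urlparse(manifest_url).query
# The downloader appends this query to every fragment URL it requests:
fragment_url = 'https://dl.boxcloud.com/video/seg-1.m4s'
print(f'{fragment_url}?{query}')  # -> https://dl.boxcloud.com/video/seg-1.m4s?access_token=abc123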


@@ -1,18 +1,15 @@
-import json
-
 from .common import InfoExtractor
 from ..utils import (
-    determine_ext,
     ExtractorError,
     int_or_none,
     parse_duration,
-    parse_iso8601,
     xpath_element,
     xpath_text,
 )


 class BRIE(InfoExtractor):
+    _WORKING = False
     IE_DESC = 'Bayerischer Rundfunk'
     _VALID_URL = r'(?P<base_url>https?://(?:www\.)?br(?:-klassik)?\.de)/(?:[a-z0-9\-_]+/)+(?P<id>[a-z0-9\-_]+)\.html'
@@ -167,142 +164,3 @@ class BRIE(InfoExtractor):
         } for variant in variants.findall('variant') if xpath_text(variant, 'url')]
         thumbnails.sort(key=lambda x: x['width'] * x['height'], reverse=True)
         return thumbnails
class BRMediathekIE(InfoExtractor):
IE_DESC = 'Bayerischer Rundfunk Mediathek'
_VALID_URL = r'https?://(?:www\.)?br\.de/mediathek//?video/(?:[^/?&#]+?-)?(?P<id>av:[0-9a-f]{24})'
_TESTS = [{
'url': 'https://www.br.de/mediathek/video/gesundheit-die-sendung-vom-28112017-av:5a1e6a6e8fce6d001871cc8e',
'md5': 'fdc3d485835966d1622587d08ba632ec',
'info_dict': {
'id': 'av:5a1e6a6e8fce6d001871cc8e',
'ext': 'mp4',
'title': 'Die Sendung vom 28.11.2017',
'description': 'md5:6000cdca5912ab2277e5b7339f201ccc',
'timestamp': 1511942766,
'upload_date': '20171129',
}
}, {
'url': 'https://www.br.de/mediathek//video/av:61b0db581aed360007558c12',
'only_matching': True,
}]
def _real_extract(self, url):
clip_id = self._match_id(url)
clip = self._download_json(
'https://proxy-base.master.mango.express/graphql',
clip_id, data=json.dumps({
"query": """{
viewer {
clip(id: "%s") {
title
description
duration
createdAt
ageRestriction
videoFiles {
edges {
node {
publicLocation
fileSize
videoProfile {
width
height
bitrate
encoding
}
}
}
}
captionFiles {
edges {
node {
publicLocation
}
}
}
teaserImages {
edges {
node {
imageFiles {
edges {
node {
publicLocation
width
height
}
}
}
}
}
}
}
}
}""" % clip_id}).encode(), headers={
'Content-Type': 'application/json',
})['data']['viewer']['clip']
title = clip['title']
formats = []
for edge in clip.get('videoFiles', {}).get('edges', []):
node = edge.get('node', {})
n_url = node.get('publicLocation')
if not n_url:
continue
ext = determine_ext(n_url)
if ext == 'm3u8':
formats.extend(self._extract_m3u8_formats(
n_url, clip_id, 'mp4', 'm3u8_native',
m3u8_id='hls', fatal=False))
else:
video_profile = node.get('videoProfile', {})
tbr = int_or_none(video_profile.get('bitrate'))
format_id = 'http'
if tbr:
format_id += '-%d' % tbr
formats.append({
'format_id': format_id,
'url': n_url,
'width': int_or_none(video_profile.get('width')),
'height': int_or_none(video_profile.get('height')),
'tbr': tbr,
'filesize': int_or_none(node.get('fileSize')),
})
subtitles = {}
for edge in clip.get('captionFiles', {}).get('edges', []):
node = edge.get('node', {})
n_url = node.get('publicLocation')
if not n_url:
continue
subtitles.setdefault('de', []).append({
'url': n_url,
})
thumbnails = []
for edge in clip.get('teaserImages', {}).get('edges', []):
for image_edge in edge.get('node', {}).get('imageFiles', {}).get('edges', []):
node = image_edge.get('node', {})
n_url = node.get('publicLocation')
if not n_url:
continue
thumbnails.append({
'url': n_url,
'width': int_or_none(node.get('width')),
'height': int_or_none(node.get('height')),
})
return {
'id': clip_id,
'title': title,
'description': clip.get('description'),
'duration': int_or_none(clip.get('duration')),
'timestamp': parse_iso8601(clip.get('createdAt')),
'age_limit': int_or_none(clip.get('ageRestriction')),
'formats': formats,
'subtitles': subtitles,
'thumbnails': thumbnails,
}
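Note: for reference, the deleted BRMediathekIE fetched all metadata in a single GraphQL POST. A stripped-down sketch of that request shape using only the stdlib (endpoint and clip id come from the removed code above; the service may no longer respond, so the network call is left commented out):

import json
import urllib.request

query = '{ viewer { clip(id: "av:5a1e6a6e8fce6d001871cc8e") { title duration } } }'
req = urllib.request.Request(
    'https://proxy-base.master.mango.express/graphql',
    data=json.dumps({'query': query}).encode(),
    headers={'Content-Type': 'application/json'})
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp)['data']['viewer']['clip'])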


@@ -1,86 +0,0 @@
from .common import InfoExtractor
from .youtube import YoutubeIE
from ..utils import (
int_or_none,
url_or_none,
)
class BreakIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?break\.com/video/(?P<display_id>[^/]+?)(?:-(?P<id>\d+))?(?:[/?#&]|$)'
_TESTS = [{
'url': 'http://www.break.com/video/when-girls-act-like-guys-2468056',
'info_dict': {
'id': '2468056',
'ext': 'mp4',
'title': 'When Girls Act Like D-Bags',
'age_limit': 13,
},
}, {
# youtube embed
'url': 'http://www.break.com/video/someone-forgot-boat-brakes-work',
'info_dict': {
'id': 'RrrDLdeL2HQ',
'ext': 'mp4',
'title': 'Whale Watching Boat Crashing Into San Diego Dock',
'description': 'md5:afc1b2772f0a8468be51dd80eb021069',
'upload_date': '20160331',
'uploader': 'Steve Holden',
'uploader_id': 'sdholden07',
},
'params': {
'skip_download': True,
}
}, {
'url': 'http://www.break.com/video/ugc/baby-flex-2773063',
'only_matching': True,
}]
def _real_extract(self, url):
display_id, video_id = self._match_valid_url(url).groups()
webpage = self._download_webpage(url, display_id)
youtube_url = YoutubeIE._extract_url(webpage)
if youtube_url:
return self.url_result(youtube_url, ie=YoutubeIE.ie_key())
content = self._parse_json(
self._search_regex(
r'(?s)content["\']\s*:\s*(\[.+?\])\s*[,\n]', webpage,
'content'),
display_id)
formats = []
for video in content:
video_url = url_or_none(video.get('url'))
if not video_url:
continue
bitrate = int_or_none(self._search_regex(
r'(\d+)_kbps', video_url, 'tbr', default=None))
formats.append({
'url': video_url,
'format_id': 'http-%d' % bitrate if bitrate else 'http',
'tbr': bitrate,
})
title = self._search_regex(
(r'title["\']\s*:\s*(["\'])(?P<value>(?:(?!\1).)+)\1',
r'<h1[^>]*>(?P<value>[^<]+)'), webpage, 'title', group='value')
def get(key, name):
return int_or_none(self._search_regex(
r'%s["\']\s*:\s*["\'](\d+)' % key, webpage, name,
default=None))
age_limit = get('ratings', 'age limit')
video_id = video_id or get('pid', 'video id') or display_id
return {
'id': video_id,
'display_id': display_id,
'title': title,
'thumbnail': self._og_search_thumbnail(webpage),
'age_limit': age_limit,
'formats': formats,
}
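Note: the `tbr` sniffing above is a plain regex over the media URL; sketched standalone (URL invented for illustration):

import re
from yt_dlp.utils import int_or_none

video_url = 'http://video.example.com/clip-1464_kbps.mp4'  # hypothetical
print(int_or_none(re.search(r'(\d+)_kbps', video_url).group(1)))  # -> 1464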


@@ -0,0 +1,123 @@
import re
from functools import partial
from .common import InfoExtractor
from ..networking.exceptions import HTTPError
from ..utils import (
ExtractorError,
bug_reports_message,
clean_html,
format_field,
get_element_text_and_html_by_tag,
int_or_none,
url_or_none,
)
from ..utils.traversal import traverse_obj
class BundestagIE(InfoExtractor):
_VALID_URL = [
r'https?://dbtg\.tv/[cf]vid/(?P<id>\d+)',
r'https?://www\.bundestag\.de/mediathek/?\?(?:[^#]+&)?videoid=(?P<id>\d+)',
]
_TESTS = [{
'url': 'https://dbtg.tv/cvid/7605304',
'info_dict': {
'id': '7605304',
'ext': 'mp4',
'title': '145. Sitzung vom 15.12.2023, TOP 24 Barrierefreiheit',
'description': 'md5:321a9dc6bdad201264c0045efc371561',
},
}, {
'url': 'https://www.bundestag.de/mediathek?videoid=7602120&url=L21lZGlhdGhla292ZXJsYXk=&mod=mediathek',
'info_dict': {
'id': '7602120',
'ext': 'mp4',
'title': '130. Sitzung vom 18.10.2023, TOP 1 Befragung der Bundesregierung',
'description': 'Befragung der Bundesregierung',
},
}, {
'url': 'https://www.bundestag.de/mediathek?videoid=7604941#url=L21lZGlhdGhla292ZXJsYXk/dmlkZW9pZD03NjA0OTQx&mod=mediathek',
'only_matching': True,
}, {
'url': 'http://dbtg.tv/fvid/3594346',
'only_matching': True,
}]
_OVERLAY_URL = 'https://www.bundestag.de/mediathekoverlay'
_INSTANCE_FORMAT = 'https://cldf-wzw-od.r53.cdn.tv1.eu/13014bundestagod/_definst_/13014bundestag/ondemand/3777parlamentsfernsehen/archiv/app144277506/145293313/{0}/{0}_playlist.smil/playlist.m3u8'
_SHARE_URL = 'https://webtv.bundestag.de/player/macros/_x_s-144277506/shareData.json?contentId='
_SHARE_AUDIO_REGEX = r'/\d+_(?P<codec>\w+)_(?P<bitrate>\d+)kb_(?P<channels>\w+)_\w+_\d+\.(?P<ext>\w+)'
_SHARE_VIDEO_REGEX = r'/\d+_(?P<codec>\w+)_(?P<width>\w+)_(?P<height>\w+)_(?P<bitrate>\d+)kb_\w+_\w+_\d+\.(?P<ext>\w+)'
def _bt_extract_share_formats(self, video_id):
share_data = self._download_json(
f'{self._SHARE_URL}{video_id}', video_id, note='Downloading share format JSON')
if traverse_obj(share_data, ('status', 'code', {int})) != 1:
self.report_warning(format_field(
share_data, [('status', 'message', {str})],
'Share API response: %s', default='Unknown Share API Error')
+ bug_reports_message())
return
for name, url in share_data.items():
if not isinstance(name, str) or not url_or_none(url):
continue
elif name.startswith('audio'):
match = re.search(self._SHARE_AUDIO_REGEX, url)
yield {
'format_id': name,
'url': url,
'vcodec': 'none',
**traverse_obj(match, {
'acodec': 'codec',
'audio_channels': ('channels', {{'mono': 1, 'stereo': 2}.get}),
'abr': ('bitrate', {int_or_none}),
'ext': 'ext',
}),
}
elif name.startswith('download'):
match = re.search(self._SHARE_VIDEO_REGEX, url)
yield {
'format_id': name,
'url': url,
**traverse_obj(match, {
'vcodec': 'codec',
'tbr': ('bitrate', {int_or_none}),
'width': ('width', {int_or_none}),
'height': ('height', {int_or_none}),
'ext': 'ext',
}),
}
def _real_extract(self, url):
video_id = self._match_id(url)
formats = []
result = {'id': video_id, 'formats': formats}
try:
formats.extend(self._extract_m3u8_formats(
self._INSTANCE_FORMAT.format(video_id), video_id, m3u8_id='instance'))
except ExtractorError as error:
if isinstance(error.cause, HTTPError) and error.cause.status == 404:
raise ExtractorError('Could not find video id', expected=True)
self.report_warning(f'Error extracting hls formats: {error}', video_id)
formats.extend(self._bt_extract_share_formats(video_id))
if not formats:
self.raise_no_formats('Could not find suitable formats', video_id=video_id)
result.update(traverse_obj(self._download_webpage(
self._OVERLAY_URL, video_id,
query={'videoid': video_id, 'view': 'main'},
note='Downloading metadata overlay', fatal=False,
), {
'title': (
{partial(get_element_text_and_html_by_tag, 'h3')}, 0,
{partial(re.sub, r'<span[^>]*>[^<]+</span>', '')}, {clean_html}),
'description': ({partial(get_element_text_and_html_by_tag, 'p')}, 0, {clean_html}),
}))
return result
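Note: `_bt_extract_share_formats` derives codec, resolution and bitrate purely from the share URL's filename via the two regexes above; `traverse_obj` then lifts the named groups into format fields. A standalone sketch against an invented URL of the expected shape:

import re

_SHARE_VIDEO_REGEX = r'/\d+_(?P<codec>\w+)_(?P<width>\w+)_(?P<height>\w+)_(?P<bitrate>\d+)kb_\w+_\w+_\d+\.(?P<ext>\w+)'
url = 'https://example.org/7605304_h264_1280_720_1500kb_baseline_de_128.mp4'  # hypothetical
m = re.search(_SHARE_VIDEO_REGEX, url)
print(m.group('codec'), m.group('width'), m.group('height'), m.group('bitrate'), m.group('ext'))
# -> h264 1280 720 1500 mp4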


@@ -8,9 +8,9 @@ from ..utils import (

 class BYUtvIE(InfoExtractor):
+    _WORKING = False
     _VALID_URL = r'https?://(?:www\.)?byutv\.org/(?:watch|player)/(?!event/)(?P<id>[0-9a-f-]+)(?:/(?P<display_id>[^/?#&]+))?'
     _TESTS = [{
-        # ooyalaVOD
         'url': 'http://www.byutv.org/watch/6587b9a3-89d2-42a6-a7f7-fd2f81840a7d/studio-c-season-5-episode-5',
         'info_dict': {
             'id': 'ZvanRocTpW-G5_yZFeltTAMv6jxOU9KH',
@@ -24,7 +24,6 @@ class BYUtvIE(InfoExtractor):
         'params': {
             'skip_download': True,
         },
-        'add_ie': ['Ooyala'],
     }, {
         # dvr
         'url': 'https://www.byutv.org/player/8f1dab9b-b243-47c8-b525-3e2d021a3451/byu-softball-pacific-vs-byu-41219---game-2',
@@ -63,19 +62,6 @@ class BYUtvIE(InfoExtractor):
             'x-byutv-platformkey': 'xsaaw9c7y5',
         })

-        ep = video.get('ooyalaVOD')
-        if ep:
-            return {
-                '_type': 'url_transparent',
-                'ie_key': 'Ooyala',
-                'url': 'ooyala:%s' % ep['providerId'],
-                'id': video_id,
-                'display_id': display_id,
-                'title': ep.get('title'),
-                'description': ep.get('description'),
-                'thumbnail': ep.get('imageThumbnail'),
-            }
-
         info = {}
         formats = []
         subtitles = {}


@@ -1,87 +0,0 @@
import re
from .common import InfoExtractor
from ..utils import (
int_or_none,
parse_duration,
unified_strdate,
)
class CamWithHerIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?camwithher\.tv/view_video\.php\?.*\bviewkey=(?P<id>\w+)'
_TESTS = [{
'url': 'http://camwithher.tv/view_video.php?viewkey=6e9a24e2c0e842e1f177&page=&viewtype=&category=',
'info_dict': {
'id': '5644',
'ext': 'flv',
'title': 'Periscope Tease',
'description': 'In the clouds teasing on periscope to my favorite song',
'duration': 240,
'view_count': int,
'comment_count': int,
'uploader': 'MileenaK',
'upload_date': '20160322',
'age_limit': 18,
},
'params': {
'skip_download': True,
}
}, {
'url': 'http://camwithher.tv/view_video.php?viewkey=6dfd8b7c97531a459937',
'only_matching': True,
}, {
'url': 'http://camwithher.tv/view_video.php?page=&viewkey=6e9a24e2c0e842e1f177&viewtype=&category=',
'only_matching': True,
}, {
'url': 'http://camwithher.tv/view_video.php?viewkey=b6c3b5bea9515d1a1fc4&page=&viewtype=&category=mv',
'only_matching': True,
}]
def _real_extract(self, url):
video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id)
flv_id = self._html_search_regex(
r'<a[^>]+href=["\']/download/\?v=(\d+)', webpage, 'video id')
# Video URL construction algorithm is reverse-engineered from cwhplayer.swf
rtmp_url = 'rtmp://camwithher.tv/clipshare/%s' % (
('mp4:%s.mp4' % flv_id) if int(flv_id) > 2010 else flv_id)
title = self._html_search_regex(
r'<div[^>]+style="float:left"[^>]*>\s*<h2>(.+?)</h2>', webpage, 'title')
description = self._html_search_regex(
r'>Description:</span>(.+?)</div>', webpage, 'description', default=None)
runtime = self._search_regex(
r'Runtime\s*:\s*(.+?) \|', webpage, 'duration', default=None)
if runtime:
runtime = re.sub(r'[\s-]', '', runtime)
duration = parse_duration(runtime)
view_count = int_or_none(self._search_regex(
r'Views\s*:\s*(\d+)', webpage, 'view count', default=None))
comment_count = int_or_none(self._search_regex(
r'Comments\s*:\s*(\d+)', webpage, 'comment count', default=None))
uploader = self._search_regex(
r'Added by\s*:\s*<a[^>]+>([^<]+)</a>', webpage, 'uploader', default=None)
upload_date = unified_strdate(self._search_regex(
r'Added on\s*:\s*([\d-]+)', webpage, 'upload date', default=None))
return {
'id': flv_id,
'url': rtmp_url,
'ext': 'flv',
'no_resume': True,
'title': title,
'description': description,
'duration': duration,
'view_count': view_count,
'comment_count': comment_count,
'uploader': uploader,
'upload_date': upload_date,
'age_limit': 18
}
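Note: the RTMP URL construction above in isolation (mirrors the removed code; the id threshold comes from the reverse-engineered player mentioned in its comment):

def build_rtmp_url(flv_id: str) -> str:
    # Ids above 2010 point at MP4 assets and need the 'mp4:' prefix;
    # older ids are bare FLV stream names
    stream = 'mp4:%s.mp4' % flv_id if int(flv_id) > 2010 else flv_id
    return 'rtmp://camwithher.tv/clipshare/' + stream

print(build_rtmp_url('5644'))  # -> rtmp://camwithher.tv/clipshare/mp4:5644.mp4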


@@ -1,105 +0,0 @@
from .common import InfoExtractor
from ..compat import compat_str
from ..utils import (
format_field,
float_or_none,
int_or_none,
try_get,
)
from .videomore import VideomoreIE
class CarambaTVIE(InfoExtractor):
_VALID_URL = r'(?:carambatv:|https?://video1\.carambatv\.ru/v/)(?P<id>\d+)'
_TESTS = [{
'url': 'http://video1.carambatv.ru/v/191910501',
'md5': '2f4a81b7cfd5ab866ee2d7270cb34a2a',
'info_dict': {
'id': '191910501',
'ext': 'mp4',
'title': '[BadComedian] - Разборка в Маниле (Абсолютный обзор)',
'thumbnail': r're:^https?://.*\.jpg',
'duration': 2678.31,
},
}, {
'url': 'carambatv:191910501',
'only_matching': True,
}]
def _real_extract(self, url):
video_id = self._match_id(url)
video = self._download_json(
'http://video1.carambatv.ru/v/%s/videoinfo.js' % video_id,
video_id)
title = video['title']
base_url = video.get('video') or 'http://video1.carambatv.ru/v/%s/' % video_id
formats = [{
'url': base_url + f['fn'],
'height': int_or_none(f.get('height')),
'format_id': format_field(f, 'height', '%sp'),
} for f in video['qualities'] if f.get('fn')]
thumbnail = video.get('splash')
duration = float_or_none(try_get(
video, lambda x: x['annotations'][0]['end_time'], compat_str))
return {
'id': video_id,
'title': title,
'thumbnail': thumbnail,
'duration': duration,
'formats': formats,
}
class CarambaTVPageIE(InfoExtractor):
_VALID_URL = r'https?://carambatv\.ru/(?:[^/]+/)+(?P<id>[^/?#&]+)'
_TEST = {
'url': 'http://carambatv.ru/movie/bad-comedian/razborka-v-manile/',
'md5': 'a49fb0ec2ad66503eeb46aac237d3c86',
'info_dict': {
'id': '475222',
'ext': 'flv',
'title': '[BadComedian] - Разборка в Маниле (Абсолютный обзор)',
'thumbnail': r're:^https?://.*\.jpg',
# duration reported by videomore is incorrect
'duration': int,
},
'add_ie': [VideomoreIE.ie_key()],
}
def _real_extract(self, url):
video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id)
videomore_url = VideomoreIE._extract_url(webpage)
if not videomore_url:
videomore_id = self._search_regex(
r'getVMCode\s*\(\s*["\']?(\d+)', webpage, 'videomore id',
default=None)
if videomore_id:
videomore_url = 'videomore:%s' % videomore_id
if videomore_url:
title = self._og_search_title(webpage)
return {
'_type': 'url_transparent',
'url': videomore_url,
'ie_key': VideomoreIE.ie_key(),
'title': title,
}
video_url = self._og_search_property('video:iframe', webpage, default=None)
if not video_url:
video_id = self._search_regex(
r'(?:video_id|crmb_vuid)\s*[:=]\s*["\']?(\d+)',
webpage, 'video id')
video_url = 'carambatv:%s' % video_id
return self.url_result(video_url, CarambaTVIE.ie_key())


@@ -180,6 +180,13 @@ class CBCPlayerIE(InfoExtractor):
             'thumbnail': 'http://thumbnails.cbc.ca/maven_legacy/thumbnails/sonali-karnick-220.jpg',
             'chapters': [],
             'duration': 494.811,
+            'categories': ['AudioMobile/All in a Weekend Montreal'],
+            'tags': 'count:8',
+            'location': 'Quebec',
+            'series': 'All in a Weekend Montreal',
+            'season': 'Season 2015',
+            'season_number': 2015,
+            'media_type': 'Excerpt',
         },
     }, {
         'url': 'http://www.cbc.ca/player/play/2164402062',
@@ -195,25 +202,37 @@ class CBCPlayerIE(InfoExtractor):
             'thumbnail': 'https://thumbnails.cbc.ca/maven_legacy/thumbnails/277/67/cancer_852x480_2164412612.jpg',
             'chapters': [],
             'duration': 186.867,
+            'series': 'CBC News: Windsor at 6:00',
+            'categories': ['News/Canada/Windsor'],
+            'location': 'Windsor',
+            'tags': ['cancer'],
+            'creator': 'Allison Johnson',
+            'media_type': 'Excerpt',
         },
     }, {
         # Has subtitles
         # These broadcasts expire after ~1 month, can find new test URL here:
         # https://www.cbc.ca/player/news/TV%20Shows/The%20National/Latest%20Broadcast
-        'url': 'http://www.cbc.ca/player/play/2249992771553',
-        'md5': '2f2fb675dd4f0f8a5bb7588d1b13bacd',
+        'url': 'http://www.cbc.ca/player/play/2284799043667',
+        'md5': '9b49f0839e88b6ec0b01d840cf3d42b5',
         'info_dict': {
-            'id': '2249992771553',
+            'id': '2284799043667',
             'ext': 'mp4',
-            'title': 'The National | Womens soccer pay, Florida seawater, Swift quake',
-            'description': 'md5:adba28011a56cfa47a080ff198dad27a',
-            'timestamp': 1690596000,
-            'duration': 2716.333,
+            'title': 'The National | Hockey coach charged, Green grants, Safer drugs',
+            'description': 'md5:84ef46321c94bcf7d0159bb565d26bfa',
+            'timestamp': 1700272800,
+            'duration': 2718.833,
             'subtitles': {'eng': [{'ext': 'vtt', 'protocol': 'm3u8_native'}]},
-            'thumbnail': 'https://thumbnails.cbc.ca/maven_legacy/thumbnails/481/326/thumbnail.jpeg',
+            'thumbnail': 'https://thumbnails.cbc.ca/maven_legacy/thumbnails/907/171/thumbnail.jpeg',
             'uploader': 'CBCC-NEW',
             'chapters': 'count:5',
-            'upload_date': '20230729',
+            'upload_date': '20231118',
+            'categories': 'count:4',
+            'series': 'The National - Full Show',
+            'tags': 'count:1',
+            'creator': 'News',
+            'location': 'Canada',
+            'media_type': 'Full Program',
         },
     }]


@@ -1,252 +0,0 @@
import re
from .common import InfoExtractor
from ..utils import (
clean_html,
int_or_none,
parse_iso8601,
qualities,
unescapeHTML,
)
class Channel9IE(InfoExtractor):
IE_DESC = 'Channel 9'
IE_NAME = 'channel9'
_VALID_URL = r'https?://(?:www\.)?(?:channel9\.msdn\.com|s\.ch9\.ms)/(?P<contentpath>.+?)(?P<rss>/RSS)?/?(?:[?#&]|$)'
_EMBED_REGEX = [r'<iframe[^>]+src=["\'](?P<url>https?://channel9\.msdn\.com/(?:[^/]+/)+)player\b']
_TESTS = [{
'url': 'http://channel9.msdn.com/Events/TechEd/Australia/2013/KOS002',
'md5': '32083d4eaf1946db6d454313f44510ca',
'info_dict': {
'id': '6c413323-383a-49dc-88f9-a22800cab024',
'ext': 'wmv',
'title': 'Developer Kick-Off Session: Stuff We Love',
'description': 'md5:b80bf9355a503c193aff7ec6cd5a7731',
'duration': 4576,
'thumbnail': r're:https?://.*\.jpg',
'timestamp': 1377717420,
'upload_date': '20130828',
'session_code': 'KOS002',
'session_room': 'Arena 1A',
'session_speakers': 'count:5',
},
}, {
'url': 'http://channel9.msdn.com/posts/Self-service-BI-with-Power-BI-nuclear-testing',
'md5': 'dcf983ee6acd2088e7188c3cf79b46bc',
'info_dict': {
'id': 'fe8e435f-bb93-4e01-8e97-a28c01887024',
'ext': 'wmv',
'title': 'Self-service BI with Power BI - nuclear testing',
'description': 'md5:2d17fec927fc91e9e17783b3ecc88f54',
'duration': 1540,
'thumbnail': r're:https?://.*\.jpg',
'timestamp': 1386381991,
'upload_date': '20131207',
'authors': ['Mike Wilmot'],
},
}, {
# low quality mp4 is best
'url': 'https://channel9.msdn.com/Events/CPP/CppCon-2015/Ranges-for-the-Standard-Library',
'info_dict': {
'id': '33ad69d2-6a4e-4172-83a1-a523013dec76',
'ext': 'mp4',
'title': 'Ranges for the Standard Library',
'description': 'md5:9895e0a9fd80822d2f01c454b8f4a372',
'duration': 5646,
'thumbnail': r're:https?://.*\.jpg',
'upload_date': '20150930',
'timestamp': 1443640735,
},
'params': {
'skip_download': True,
},
}, {
'url': 'https://channel9.msdn.com/Events/DEVintersection/DEVintersection-2016/RSS',
'info_dict': {
'id': 'Events/DEVintersection/DEVintersection-2016',
'title': 'DEVintersection 2016 Orlando Sessions',
},
'playlist_mincount': 14,
}, {
'url': 'https://channel9.msdn.com/Niners/Splendid22/Queue/76acff796e8f411184b008028e0d492b/RSS',
'only_matching': True,
}, {
'url': 'https://channel9.msdn.com/Events/Speakers/scott-hanselman/RSS?UrlSafeName=scott-hanselman',
'only_matching': True,
}]
_RSS_URL = 'http://channel9.msdn.com/%s/RSS'
def _extract_list(self, video_id, rss_url=None):
if not rss_url:
rss_url = self._RSS_URL % video_id
rss = self._download_xml(rss_url, video_id, 'Downloading RSS')
entries = [self.url_result(session_url.text, 'Channel9')
for session_url in rss.findall('./channel/item/link')]
title_text = rss.find('./channel/title').text
return self.playlist_result(entries, video_id, title_text)
def _real_extract(self, url):
content_path, rss = self._match_valid_url(url).groups()
if rss:
return self._extract_list(content_path, url)
webpage = self._download_webpage(
url, content_path, 'Downloading web page')
episode_data = self._search_regex(
r"data-episode='([^']+)'", webpage, 'episode data', default=None)
if episode_data:
episode_data = self._parse_json(unescapeHTML(
episode_data), content_path)
content_id = episode_data['contentId']
is_session = '/Sessions(' in episode_data['api']
content_url = 'https://channel9.msdn.com/odata' + episode_data['api'] + '?$select=Captions,CommentCount,MediaLengthInSeconds,PublishedDate,Rating,RatingCount,Title,VideoMP4High,VideoMP4Low,VideoMP4Medium,VideoPlayerPreviewImage,VideoWMV,VideoWMVHQ,Views,'
if is_session:
content_url += 'Code,Description,Room,Slides,Speakers,ZipFile&$expand=Speakers'
else:
content_url += 'Authors,Body&$expand=Authors'
content_data = self._download_json(content_url, content_id)
title = content_data['Title']
QUALITIES = (
'mp3',
'wmv', 'mp4',
'wmv-low', 'mp4-low',
'wmv-mid', 'mp4-mid',
'wmv-high', 'mp4-high',
)
quality_key = qualities(QUALITIES)
def quality(quality_id, format_url):
return (len(QUALITIES) if '_Source.' in format_url
else quality_key(quality_id))
formats = []
urls = set()
SITE_QUALITIES = {
'MP3': 'mp3',
'MP4': 'mp4',
'Low Quality WMV': 'wmv-low',
'Low Quality MP4': 'mp4-low',
'Mid Quality WMV': 'wmv-mid',
'Mid Quality MP4': 'mp4-mid',
'High Quality WMV': 'wmv-high',
'High Quality MP4': 'mp4-high',
}
formats_select = self._search_regex(
r'(?s)<select[^>]+name=["\']format[^>]+>(.+?)</select', webpage,
'formats select', default=None)
if formats_select:
for mobj in re.finditer(
r'<option\b[^>]+\bvalue=(["\'])(?P<url>(?:(?!\1).)+)\1[^>]*>\s*(?P<format>[^<]+?)\s*<',
formats_select):
format_url = mobj.group('url')
if format_url in urls:
continue
urls.add(format_url)
format_id = mobj.group('format')
quality_id = SITE_QUALITIES.get(format_id, format_id)
formats.append({
'url': format_url,
'format_id': quality_id,
'quality': quality(quality_id, format_url),
'vcodec': 'none' if quality_id == 'mp3' else None,
})
API_QUALITIES = {
'VideoMP4Low': 'mp4-low',
'VideoWMV': 'wmv-mid',
'VideoMP4Medium': 'mp4-mid',
'VideoMP4High': 'mp4-high',
'VideoWMVHQ': 'wmv-hq',
}
for format_id, q in API_QUALITIES.items():
q_url = content_data.get(format_id)
if not q_url or q_url in urls:
continue
urls.add(q_url)
formats.append({
'url': q_url,
'format_id': q,
'quality': quality(q, q_url),
})
slides = content_data.get('Slides')
zip_file = content_data.get('ZipFile')
if not formats and not slides and not zip_file:
self.raise_no_formats(
'None of recording, slides or zip are available for %s' % content_path)
subtitles = {}
for caption in content_data.get('Captions', []):
caption_url = caption.get('Url')
if not caption_url:
continue
subtitles.setdefault(caption.get('Language', 'en'), []).append({
'url': caption_url,
'ext': 'vtt',
})
common = {
'id': content_id,
'title': title,
'description': clean_html(content_data.get('Description') or content_data.get('Body')),
'thumbnail': content_data.get('VideoPlayerPreviewImage'),
'duration': int_or_none(content_data.get('MediaLengthInSeconds')),
'timestamp': parse_iso8601(content_data.get('PublishedDate')),
'avg_rating': int_or_none(content_data.get('Rating')),
'rating_count': int_or_none(content_data.get('RatingCount')),
'view_count': int_or_none(content_data.get('Views')),
'comment_count': int_or_none(content_data.get('CommentCount')),
'subtitles': subtitles,
}
if is_session:
speakers = []
for s in content_data.get('Speakers', []):
speaker_name = s.get('FullName')
if not speaker_name:
continue
speakers.append(speaker_name)
common.update({
'session_code': content_data.get('Code'),
'session_room': content_data.get('Room'),
'session_speakers': speakers,
})
else:
authors = []
for a in content_data.get('Authors', []):
author_name = a.get('DisplayName')
if not author_name:
continue
authors.append(author_name)
common['authors'] = authors
contents = []
if slides:
d = common.copy()
d.update({'title': title + '-Slides', 'url': slides})
contents.append(d)
if zip_file:
d = common.copy()
d.update({'title': title + '-Zip', 'url': zip_file})
contents.append(d)
if formats:
d = common.copy()
d.update({'title': title, 'formats': formats})
contents.append(d)
return self.playlist_result(contents)
else:
return self._extract_list(content_path)
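Note: the deleted extractor ranked formats with yt-dlp's `qualities` helper, which turns a worst-to-best tuple into an integer scoring function:

from yt_dlp.utils import qualities

quality_key = qualities(('mp3', 'wmv-low', 'mp4-low', 'wmv-high', 'mp4-high'))
print(quality_key('mp4-high'))  # 4 -- later entries rank higher
print(quality_key('mp3'))       # 0
print(quality_key('flac'))      # -1 for ids not in the tuple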


@@ -1,88 +0,0 @@
import re
from .common import InfoExtractor
from ..compat import compat_b64decode
from ..utils import parse_duration
class ChirbitIE(InfoExtractor):
IE_NAME = 'chirbit'
_VALID_URL = r'https?://(?:www\.)?chirb\.it/(?:(?:wp|pl)/|fb_chirbit_player\.swf\?key=)?(?P<id>[\da-zA-Z]+)'
_TESTS = [{
'url': 'http://chirb.it/be2abG',
'info_dict': {
'id': 'be2abG',
'ext': 'mp3',
'title': 'md5:f542ea253f5255240be4da375c6a5d7e',
'description': 'md5:f24a4e22a71763e32da5fed59e47c770',
'duration': 306,
'uploader': 'Gerryaudio',
},
'params': {
'skip_download': True,
}
}, {
'url': 'https://chirb.it/fb_chirbit_player.swf?key=PrIPv5',
'only_matching': True,
}, {
'url': 'https://chirb.it/wp/MN58c2',
'only_matching': True,
}]
def _real_extract(self, url):
audio_id = self._match_id(url)
webpage = self._download_webpage(
'http://chirb.it/%s' % audio_id, audio_id)
data_fd = self._search_regex(
r'data-fd=(["\'])(?P<url>(?:(?!\1).)+)\1',
webpage, 'data fd', group='url')
# Reverse engineered from https://chirb.it/js/chirbit.player.js (look
# for soundURL)
audio_url = compat_b64decode(data_fd[::-1]).decode('utf-8')
title = self._search_regex(
r'class=["\']chirbit-title["\'][^>]*>([^<]+)', webpage, 'title')
description = self._search_regex(
r'<h3>Description</h3>\s*<pre[^>]*>([^<]+)</pre>',
webpage, 'description', default=None)
duration = parse_duration(self._search_regex(
r'class=["\']c-length["\'][^>]*>([^<]+)',
webpage, 'duration', fatal=False))
uploader = self._search_regex(
r'id=["\']chirbit-username["\'][^>]*>([^<]+)',
webpage, 'uploader', fatal=False)
return {
'id': audio_id,
'url': audio_url,
'title': title,
'description': description,
'duration': duration,
'uploader': uploader,
}
class ChirbitProfileIE(InfoExtractor):
IE_NAME = 'chirbit:profile'
_VALID_URL = r'https?://(?:www\.)?chirbit\.com/(?:rss/)?(?P<id>[^/]+)'
_TEST = {
'url': 'http://chirbit.com/ScarletBeauty',
'info_dict': {
'id': 'ScarletBeauty',
},
'playlist_mincount': 3,
}
def _real_extract(self, url):
profile_id = self._match_id(url)
webpage = self._download_webpage(url, profile_id)
entries = [
self.url_result(self._proto_relative_url('//chirb.it/' + video_id))
for _, video_id in re.findall(r'<input[^>]+id=([\'"])copy-btn-(?P<id>[0-9a-zA-Z]+)\1', webpage)]
return self.playlist_result(entries, profile_id)


@@ -1,56 +0,0 @@
from .common import InfoExtractor
from ..utils import (
unified_strdate,
xpath_text,
)
class CinchcastIE(InfoExtractor):
_VALID_URL = r'https?://player\.cinchcast\.com/.*?(?:assetId|show_id)=(?P<id>[0-9]+)'
_EMBED_REGEX = [r'<iframe[^>]+?src=(["\'])(?P<url>https?://player\.cinchcast\.com/.+?)\1']
_TESTS = [{
'url': 'http://player.cinchcast.com/?show_id=5258197&platformId=1&assetType=single',
'info_dict': {
'id': '5258197',
'ext': 'mp3',
'title': 'Train Your Brain to Up Your Game with Coach Mandy',
'upload_date': '20130816',
},
}, {
# Actual test is run in generic, look for undergroundwellness
'url': 'http://player.cinchcast.com/?platformId=1&#038;assetType=single&#038;assetId=7141703',
'only_matching': True,
}]
def _real_extract(self, url):
video_id = self._match_id(url)
doc = self._download_xml(
'http://www.blogtalkradio.com/playerasset/mrss?assetType=single&assetId=%s' % video_id,
video_id)
item = doc.find('.//item')
title = xpath_text(item, './title', fatal=True)
date_str = xpath_text(
item, './{http://developer.longtailvideo.com/trac/}date')
upload_date = unified_strdate(date_str, day_first=False)
# duration is present but wrong
formats = [{
'format_id': 'main',
'url': item.find('./{http://search.yahoo.com/mrss/}content').attrib['url'],
}]
backup_url = xpath_text(
item, './{http://developer.longtailvideo.com/trac/}backupContent')
if backup_url:
formats.append({
'preference': 2, # seems to be more reliable
'format_id': 'backup',
'url': backup_url,
})
return {
'id': video_id,
'title': title,
'upload_date': upload_date,
'formats': formats,
}
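Note: the MRSS elements the code above reads live in the http://search.yahoo.com/mrss/ namespace, so ElementTree lookups need the Clark-notation {namespace}tag form. A standalone sketch (feed invented for illustration):

import xml.etree.ElementTree as ET

feed = ET.fromstring(
    '<rss><channel><item xmlns:media="http://search.yahoo.com/mrss/">'
    '<title>Example episode</title>'
    '<media:content url="http://audio.example.com/e1.mp3"/>'
    '</item></channel></rss>')
item = feed.find('.//item')
print(item.find('./{http://search.yahoo.com/mrss/}content').attrib['url'])
# -> http://audio.example.com/e1.mp3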

View File

@@ -1,52 +0,0 @@
from .common import InfoExtractor
from ..utils import (
find_xpath_attr,
fix_xml_ampersands
)
class ClipsyndicateIE(InfoExtractor):
_VALID_URL = r'https?://(?:chic|www)\.clipsyndicate\.com/video/play(list/\d+)?/(?P<id>\d+)'
_TESTS = [{
'url': 'http://www.clipsyndicate.com/video/play/4629301/brick_briscoe',
'md5': '4d7d549451bad625e0ff3d7bd56d776c',
'info_dict': {
'id': '4629301',
'ext': 'mp4',
'title': 'Brick Briscoe',
'duration': 612,
'thumbnail': r're:^https?://.+\.jpg',
},
}, {
'url': 'http://chic.clipsyndicate.com/video/play/5844117/shark_attack',
'only_matching': True,
}]
def _real_extract(self, url):
video_id = self._match_id(url)
js_player = self._download_webpage(
'http://eplayer.clipsyndicate.com/embed/player.js?va_id=%s' % video_id,
video_id, 'Downloading player')
# it includes a required token
flvars = self._search_regex(r'flvars: "(.*?)"', js_player, 'flvars')
pdoc = self._download_xml(
'http://eplayer.clipsyndicate.com/osmf/playlist?%s' % flvars,
video_id, 'Downloading video info',
transform_source=fix_xml_ampersands)
track_doc = pdoc.find('trackList/track')
def find_param(name):
node = find_xpath_attr(track_doc, './/param', 'name', name)
if node is not None:
return node.attrib['value']
return {
'id': video_id,
'title': find_param('title'),
'url': track_doc.find('location').text,
'thumbnail': find_param('thumbnail'),
'duration': int(find_param('duration')),
}
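Note: the playlist feed this code parses contains bare '&' characters, which break strict XML parsing; `fix_xml_ampersands` escapes them before the document is fed to ElementTree. A standalone sketch (document invented for illustration):

import xml.etree.ElementTree as ET

from yt_dlp.utils import find_xpath_attr, fix_xml_ampersands

raw = '<playlist><param name="title" value="Tom & Jerry"/></playlist>'
doc = ET.fromstring(fix_xml_ampersands(raw))
print(find_xpath_attr(doc, './/param', 'name', 'title').attrib['value'])
# -> Tom & Jerry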


@@ -1,57 +0,0 @@
from .common import InfoExtractor
from ..utils import (
str_to_int,
unified_strdate,
)
class CloudyIE(InfoExtractor):
_IE_DESC = 'cloudy.ec'
_VALID_URL = r'https?://(?:www\.)?cloudy\.ec/(?:v/|embed\.php\?.*?\bid=)(?P<id>[A-Za-z0-9]+)'
_TESTS = [{
'url': 'https://www.cloudy.ec/v/af511e2527aac',
'md5': '29832b05028ead1b58be86bf319397ca',
'info_dict': {
'id': 'af511e2527aac',
'ext': 'mp4',
'title': 'Funny Cats and Animals Compilation june 2013',
'upload_date': '20130913',
'view_count': int,
}
}, {
'url': 'http://www.cloudy.ec/embed.php?autoplay=1&id=af511e2527aac',
'only_matching': True,
}]
def _real_extract(self, url):
video_id = self._match_id(url)
webpage = self._download_webpage(
'https://www.cloudy.ec/embed.php', video_id, query={
'id': video_id,
'playerPage': 1,
'autoplay': 1,
})
info = self._parse_html5_media_entries(url, webpage, video_id)[0]
webpage = self._download_webpage(
'https://www.cloudy.ec/v/%s' % video_id, video_id, fatal=False)
if webpage:
info.update({
'title': self._search_regex(
r'<h\d[^>]*>([^<]+)<', webpage, 'title'),
'upload_date': unified_strdate(self._search_regex(
r'>Published at (\d{4}-\d{1,2}-\d{1,2})', webpage,
'upload date', fatal=False)),
'view_count': str_to_int(self._search_regex(
r'([\d,.]+) views<', webpage, 'view count', fatal=False)),
})
if not info.get('title'):
info['title'] = video_id
info['id'] = video_id
return info


@@ -6,6 +6,7 @@ from ..utils import (

 class ClubicIE(InfoExtractor):
+    _WORKING = False
     _VALID_URL = r'https?://(?:www\.)?clubic\.com/video/(?:[^/]+/)*video.*-(?P<id>[0-9]+)\.html'
     _TESTS = [{


@@ -4,6 +4,7 @@ from .mtv import MTVIE

 class CMTIE(MTVIE):  # XXX: Do not subclass from concrete IE
+    _WORKING = False
     IE_NAME = 'cmt.com'
     _VALID_URL = r'https?://(?:www\.)?cmt\.com/(?:videos|shows|(?:full-)?episodes|video-clips)/(?P<id>[^/]+)'


@@ -286,6 +286,9 @@ class InfoExtractor:
                        If it is not clear whether to use timestamp or this, use the former
     release_date:   The date (YYYYMMDD) when the video was released in UTC.
                        If not explicitly set, calculated from release_timestamp
+    release_year:   Year (YYYY) as integer when the video or album was released.
+                       To be used if no exact release date is known.
+                       If not explicitly set, calculated from release_date.
     modified_timestamp: UNIX timestamp of the moment the video was last modified.
     modified_date:  The date (YYYYMMDD) when the video was last modified in UTC.
                        If not explicitly set, calculated from modified_timestamp
@@ -379,6 +382,7 @@ class InfoExtractor:
                        'private', 'premium_only', 'subscriber_only', 'needs_auth',
                        'unlisted' or 'public'. Use 'InfoExtractor._availability'
                        to set it
+    media_type:     The type of media as classified by the site, e.g. "episode", "clip", "trailer"
     _old_archive_ids: A list of old archive ids needed for backward compatibility
     _format_sort_fields: A list of fields to use for sorting formats
     __post_extractor: A function to be called just before the metadata is
@@ -427,7 +431,6 @@ class InfoExtractor:
                     and compilations).
     disc_number:    Number of the disc or other physical medium the track belongs to,
                     as an integer.
-    release_year:   Year (YYYY) when the album was released.
     composer:       Composer of the piece

     The following fields should only be set for clips that should be cut from the original video:
@@ -2341,7 +2344,9 @@ class InfoExtractor:
         imgs_count = 0

         srcs = set()
-        media = smil.findall(self._xpath_ns('.//video', namespace)) + smil.findall(self._xpath_ns('.//audio', namespace))
+        media = itertools.chain.from_iterable(
+            smil.findall(self._xpath_ns(arg, namespace))
+            for arg in ['.//video', './/audio', './/media'])
         for medium in media:
             src = medium.get('src')
             if not src or src in srcs:
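Note: with this change `release_year` is promoted from an album-only field to a general video field, and `media_type` is documented. A minimal illustration of an info dict using both (values invented):

info_dict = {
    'id': '12345',
    'title': 'Example clip',
    'url': 'https://cdn.example.com/12345.mp4',
    'media_type': 'clip',   # the site's own classification
    'release_year': 2018,   # used when no exact release date is known
}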


@@ -46,6 +46,10 @@ class CWTVIE(InfoExtractor):
             'timestamp': 1444107300,
             'age_limit': 14,
             'uploader': 'CWTV',
+            'thumbnail': r're:^https?://.*\.jpe?g$',
+            'chapters': 'count:4',
+            'episode': 'Episode 20',
+            'season': 'Season 11',
         },
         'params': {
             # m3u8 download


@@ -1,150 +0,0 @@
from .common import InfoExtractor
from ..compat import compat_b64decode
from ..utils import (
ExtractorError,
int_or_none,
js_to_json,
parse_count,
parse_duration,
traverse_obj,
try_get,
unified_timestamp,
)
class DaftsexIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?daft\.sex/watch/(?P<id>-?\d+_\d+)'
_TESTS = [{
'url': 'https://daft.sex/watch/-35370899_456246186',
'md5': '64c04ef7b4c7b04b308f3b0c78efe7cd',
'info_dict': {
'id': '-35370899_456246186',
'ext': 'mp4',
'title': 'just relaxing',
'description': 'just relaxing Watch video Watch video in high quality',
'upload_date': '20201113',
'timestamp': 1605261911,
'thumbnail': r're:^https?://.*\.jpg$',
'age_limit': 18,
'duration': 15.0,
'view_count': int
},
}, {
'url': 'https://daft.sex/watch/-156601359_456242791',
'info_dict': {
'id': '-156601359_456242791',
'ext': 'mp4',
'title': 'Skye Blue - Dinner And A Show',
'description': 'Skye Blue - Dinner And A Show - Watch video Watch video in high quality',
'upload_date': '20200916',
'timestamp': 1600250735,
'thumbnail': 'https://psv153-1.crazycloud.ru/videos/-156601359/456242791/thumb.jpg?extra=i3D32KaBbBFf9TqDRMAVmQ',
},
'skip': 'deleted / private'
}]
def _real_extract(self, url):
video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id)
title = self._html_search_meta('name', webpage, 'title')
timestamp = unified_timestamp(self._html_search_meta('uploadDate', webpage, 'Upload Date', default=None))
description = self._html_search_meta('description', webpage, 'Description', default=None)
duration = parse_duration(self._search_regex(
r'Duration: ((?:[0-9]{2}:){0,2}[0-9]{2})',
webpage, 'duration', fatal=False))
views = parse_count(self._search_regex(
r'Views: ([0-9 ]+)',
webpage, 'views', fatal=False))
player_hash = self._search_regex(
r'DaxabPlayer\.Init\({[\s\S]*hash:\s*"([0-9a-zA-Z_\-]+)"[\s\S]*}',
webpage, 'player hash')
player_color = self._search_regex(
r'DaxabPlayer\.Init\({[\s\S]*color:\s*"([0-9a-z]+)"[\s\S]*}',
webpage, 'player color', fatal=False) or ''
embed_page = self._download_webpage(
'https://dxb.to/player/%s?color=%s' % (player_hash, player_color),
video_id, headers={'Referer': url})
video_params = self._parse_json(
self._search_regex(
r'window\.globParams\s*=\s*({[\S\s]+})\s*;\s*<\/script>',
embed_page, 'video parameters'),
video_id, transform_source=js_to_json)
server_domain = 'https://%s' % compat_b64decode(video_params['server'][::-1]).decode('utf-8')
cdn_files = traverse_obj(video_params, ('video', 'cdn_files')) or {}
if cdn_files:
formats = []
for format_id, format_data in cdn_files.items():
ext, height = format_id.split('_')
formats.append({
'format_id': format_id,
'url': f'{server_domain}/videos/{video_id.replace("_", "/")}/{height}.mp4?extra={format_data.split(".")[-1]}',
'height': int_or_none(height),
'ext': ext,
})
return {
'id': video_id,
'title': title,
'formats': formats,
'description': description,
'duration': duration,
'thumbnail': try_get(video_params, lambda vi: 'https:' + compat_b64decode(vi['video']['thumb']).decode('utf-8')),
'timestamp': timestamp,
'view_count': views,
'age_limit': 18,
}
items = self._download_json(
f'{server_domain}/method/video.get/{video_id}', video_id,
headers={'Referer': url}, query={
'token': video_params['video']['access_token'],
'videos': video_id,
'ckey': video_params['c_key'],
'credentials': video_params['video']['credentials'],
})['response']['items']
if not items:
raise ExtractorError('Video is not available', video_id=video_id, expected=True)
item = items[0]
formats = []
for f_id, f_url in item.get('files', {}).items():
if f_id == 'external':
return self.url_result(f_url)
ext, height = f_id.split('_')
height_extra_key = traverse_obj(video_params, ('video', 'partial', 'quality', height))
if height_extra_key:
formats.append({
'format_id': f'{height}p',
'url': f'{server_domain}/{f_url[8:]}&videos={video_id}&extra_key={height_extra_key}',
'height': int_or_none(height),
'ext': ext,
})
thumbnails = []
for k, v in item.items():
if k.startswith('photo_') and v:
width = k.replace('photo_', '')
thumbnails.append({
'id': width,
'url': v,
'width': int_or_none(width),
})
return {
'id': video_id,
'title': title,
'formats': formats,
'comment_count': int_or_none(item.get('comments')),
'description': description,
'duration': duration,
'thumbnails': thumbnails,
'timestamp': timestamp,
'view_count': views,
'age_limit': 18,
}


@@ -1,37 +0,0 @@
from .common import InfoExtractor
class DefenseGouvFrIE(InfoExtractor):
IE_NAME = 'defense.gouv.fr'
_VALID_URL = r'https?://.*?\.defense\.gouv\.fr/layout/set/ligthboxvideo/base-de-medias/webtv/(?P<id>[^/?#]*)'
_TEST = {
'url': 'http://www.defense.gouv.fr/layout/set/ligthboxvideo/base-de-medias/webtv/attaque-chimique-syrienne-du-21-aout-2013-1',
'md5': '75bba6124da7e63d2d60b5244ec9430c',
'info_dict': {
'id': '11213',
'ext': 'mp4',
'title': 'attaque-chimique-syrienne-du-21-aout-2013-1'
}
}
def _real_extract(self, url):
title = self._match_id(url)
webpage = self._download_webpage(url, title)
video_id = self._search_regex(
r"flashvars.pvg_id=\"(\d+)\";",
webpage, 'ID')
json_url = (
'http://static.videos.gouv.fr/brightcovehub/export/json/%s' %
video_id)
info = self._download_json(json_url, title, 'Downloading JSON config')
video_url = info['renditions'][0]['url']
return {
'id': video_id,
'ext': 'mp4',
'url': video_url,
'title': title,
}


@@ -3,6 +3,7 @@ from ..utils import parse_duration

 class DHMIE(InfoExtractor):
+    _WORKING = False
     IE_DESC = 'Filmarchiv - Deutsches Historisches Museum'
     _VALID_URL = r'https?://(?:www\.)?dhm\.de/filmarchiv/(?:[^/]+/)+(?P<id>[^/]+)'


@@ -1,81 +0,0 @@
from .common import InfoExtractor
from ..utils import (
float_or_none,
int_or_none,
)
class DotsubIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?dotsub\.com/view/(?P<id>[^/]+)'
_TESTS = [{
'url': 'https://dotsub.com/view/9c63db2a-fa95-4838-8e6e-13deafe47f09',
'md5': '21c7ff600f545358134fea762a6d42b6',
'info_dict': {
'id': '9c63db2a-fa95-4838-8e6e-13deafe47f09',
'ext': 'flv',
'title': 'MOTIVATION - "It\'s Possible" Best Inspirational Video Ever',
'description': 'md5:41af1e273edbbdfe4e216a78b9d34ac6',
'thumbnail': 're:^https?://dotsub.com/media/9c63db2a-fa95-4838-8e6e-13deafe47f09/p',
'duration': 198,
'uploader': 'liuxt',
'timestamp': 1385778501.104,
'upload_date': '20131130',
'view_count': int,
}
}, {
'url': 'https://dotsub.com/view/747bcf58-bd59-45b7-8c8c-ac312d084ee6',
'md5': '2bb4a83896434d5c26be868c609429a3',
'info_dict': {
'id': '168006778',
'ext': 'mp4',
'title': 'Apartments and flats in Raipur the white symphony',
'description': 'md5:784d0639e6b7d1bc29530878508e38fe',
'thumbnail': 're:^https?://dotsub.com/media/747bcf58-bd59-45b7-8c8c-ac312d084ee6/p',
'duration': 290,
'timestamp': 1476767794.2809999,
'upload_date': '20161018',
'uploader': 'parthivi001',
'uploader_id': 'user52596202',
'view_count': int,
},
'add_ie': ['Vimeo'],
}]
def _real_extract(self, url):
video_id = self._match_id(url)
info = self._download_json(
'https://dotsub.com/api/media/%s/metadata' % video_id, video_id)
video_url = info.get('mediaURI')
if not video_url:
webpage = self._download_webpage(url, video_id)
video_url = self._search_regex(
[r'<source[^>]+src="([^"]+)"', r'"file"\s*:\s*\'([^\']+)'],
webpage, 'video url', default=None)
info_dict = {
'id': video_id,
'url': video_url,
'ext': 'flv',
}
if not video_url:
setup_data = self._parse_json(self._html_search_regex(
r'(?s)data-setup=([\'"])(?P<content>(?!\1).+?)\1',
webpage, 'setup data', group='content'), video_id)
info_dict = {
'_type': 'url_transparent',
'url': setup_data['src'],
}
info_dict.update({
'title': info['title'],
'description': info.get('description'),
'thumbnail': info.get('screenshotURI'),
'duration': int_or_none(info.get('duration'), 1000),
'uploader': info.get('user'),
'timestamp': float_or_none(info.get('dateCreated'), 1000),
'view_count': int_or_none(info.get('numberOfViews')),
})
return info_dict


@@ -209,7 +209,7 @@ class DRTVIE(InfoExtractor):
             elif access_service == 'StandardVideo':
                 preference = 1
             fmts, subs = self._extract_m3u8_formats_and_subtitles(
-                stream.get('url'), video_id, preference=preference, m3u8_id=format_id, fatal=False)
+                stream.get('url'), video_id, ext='mp4', preference=preference, m3u8_id=format_id, fatal=False)
             formats.extend(fmts)

             api_subtitles = traverse_obj(stream, ('subtitles', lambda _, v: url_or_none(v['link']), {dict}))

yt_dlp/extractor/duoplay.py (new file, 104 lines)

@@ -0,0 +1,104 @@
from .common import InfoExtractor
from ..utils import (
ExtractorError,
extract_attributes,
get_element_text_and_html_by_tag,
int_or_none,
join_nonempty,
str_or_none,
try_call,
unified_timestamp,
)
from ..utils.traversal import traverse_obj
class DuoplayIE(InfoExtractor):
_VALID_URL = r'https://duoplay\.ee/(?P<id>\d+)/[\w-]+/?(?:\?(?:[^#]+&)?ep=(?P<ep>\d+))?'
_TESTS = [{
'note': 'Siberi võmm S02E12',
'url': 'https://duoplay.ee/4312/siberi-vomm?ep=24',
'md5': '1ff59d535310ac9c5cf5f287d8f91b2d',
'info_dict': {
'id': '4312_24',
'ext': 'mp4',
'title': 'Operatsioon "Öö"',
'thumbnail': r're:https://.+\.jpg(?:\?c=\d+)?$',
'description': 'md5:8ef98f38569d6b8b78f3d350ccc6ade8',
'upload_date': '20170523',
'timestamp': 1495567800,
'series': 'Siberi võmm',
'series_id': '4312',
'season': 'Season 2',
'season_number': 2,
'episode': 'Operatsioon "Öö"',
'episode_number': 12,
'episode_id': 24,
},
}, {
'note': 'Empty title',
'url': 'https://duoplay.ee/17/uhikarotid?ep=14',
'md5': '6aca68be71112314738dd17cced7f8bf',
'info_dict': {
'id': '17_14',
'ext': 'mp4',
'title': 'Ühikarotid',
'thumbnail': r're:https://.+\.jpg(?:\?c=\d+)?$',
'description': 'md5:4719b418e058c209def41d48b601276e',
'upload_date': '20100916',
'timestamp': 1284661800,
'series': 'Ühikarotid',
'series_id': '17',
'season': 'Season 2',
'season_number': 2,
'episode_id': 14,
'release_year': 2010,
},
}, {
'note': 'Movie without expiry',
'url': 'https://duoplay.ee/5501/pilvede-all.-neljas-ode',
'md5': '7abf63d773a49ef7c39f2c127842b8fd',
'info_dict': {
'id': '5501',
'ext': 'mp4',
'title': 'Pilvede all. Neljas õde',
'thumbnail': r're:https://.+\.jpg(?:\?c=\d+)?$',
'description': 'md5:d86a70f8f31e82c369d4d4f4c79b1279',
'cast': 'count:9',
'upload_date': '20221214',
'timestamp': 1671054000,
'release_year': 2018,
},
}]
def _real_extract(self, url):
telecast_id, episode = self._match_valid_url(url).group('id', 'ep')
video_id = join_nonempty(telecast_id, episode, delim='_')
webpage = self._download_webpage(url, video_id)
video_player = try_call(lambda: extract_attributes(
get_element_text_and_html_by_tag('video-player', webpage)[1]))
if not video_player or not video_player.get('manifest-url'):
raise ExtractorError('No video found', expected=True)
episode_attr = self._parse_json(video_player.get(':episode') or '', video_id, fatal=False) or {}
return {
'id': video_id,
'formats': self._extract_m3u8_formats(video_player['manifest-url'], video_id, 'mp4'),
**traverse_obj(episode_attr, {
'title': 'title',
'description': 'synopsis',
'thumbnail': ('images', 'original'),
'timestamp': ('airtime', {lambda x: unified_timestamp(x + ' +0200')}),
'cast': ('cast', {lambda x: x.split(', ')}),
'release_year': ('year', {int_or_none}),
}),
**(traverse_obj(episode_attr, {
'title': (None, ('subtitle', ('episode_nr', {lambda x: f'Episode {x}' if x else None}))),
'series': 'title',
'series_id': ('telecast_id', {str_or_none}),
'season_number': ('season_id', {int_or_none}),
'episode': 'subtitle',
'episode_number': ('episode_nr', {int_or_none}),
'episode_id': ('episode_id', {int_or_none}),
}, get_all=False) if episode_attr.get('category') != 'movies' else {}),
}
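
For context: DuoplayIE hinges on reading the page's <video-player> custom element, whose attributes carry both the HLS manifest and a Vue-style ':episode' JSON blob. A minimal sketch of that parsing step against a made-up snippet (all values here are invented):

import json
from yt_dlp.utils import extract_attributes, get_element_text_and_html_by_tag

# hypothetical page fragment in the shape the extractor expects
sample = '<video-player manifest-url="https://example.invalid/master.m3u8" :episode=\'{"title": "Demo"}\'></video-player>'
attrs = extract_attributes(get_element_text_and_html_by_tag('video-player', sample)[1])
assert attrs['manifest-url'] == 'https://example.invalid/master.m3u8'
assert json.loads(attrs[':episode'])['title'] == 'Demo'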

yt_dlp/extractor/echomsk.py (deleted)

@@ -1,43 +0,0 @@
import re
from .common import InfoExtractor
class EchoMskIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?echo\.msk\.ru/sounds/(?P<id>\d+)'
_TEST = {
'url': 'http://www.echo.msk.ru/sounds/1464134.html',
'md5': '2e44b3b78daff5b458e4dbc37f191f7c',
'info_dict': {
'id': '1464134',
'ext': 'mp3',
'title': 'Особое мнение - 29 декабря 2014, 19:08',
},
}
def _real_extract(self, url):
video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id)
audio_url = self._search_regex(
r'<a rel="mp3" href="([^"]+)">', webpage, 'audio URL')
title = self._html_search_regex(
r'<a href="/programs/[^"]+" target="_blank">([^<]+)</a>',
webpage, 'title')
air_date = self._html_search_regex(
r'(?s)<div class="date">(.+?)</div>',
webpage, 'date', fatal=False, default=None)
if air_date:
air_date = re.sub(r'(\s)\1+', r'\1', air_date)
if air_date:
title = '%s - %s' % (title, air_date)
return {
'id': video_id,
'url': audio_url,
'title': title,
}

yt_dlp/extractor/ehow.py (deleted)

@@ -1,36 +0,0 @@
from .common import InfoExtractor
from ..compat import compat_urllib_parse_unquote
class EHowIE(InfoExtractor):
IE_NAME = 'eHow'
_VALID_URL = r'https?://(?:www\.)?ehow\.com/[^/_?]*_(?P<id>[0-9]+)'
_TEST = {
'url': 'http://www.ehow.com/video_12245069_hardwood-flooring-basics.html',
'md5': '9809b4e3f115ae2088440bcb4efbf371',
'info_dict': {
'id': '12245069',
'ext': 'flv',
'title': 'Hardwood Flooring Basics',
'description': 'Hardwood flooring may be time consuming, but its ultimately a pretty straightforward concept. Learn about hardwood flooring basics with help from a hardware flooring business owner in this free video...',
'uploader': 'Erick Nathan',
}
}
def _real_extract(self, url):
video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id)
video_url = self._search_regex(
r'(?:file|source)=(http[^\'"&]*)', webpage, 'video URL')
final_url = compat_urllib_parse_unquote(video_url)
uploader = self._html_search_meta('uploader', webpage)
title = self._og_search_title(webpage).replace(' | eHow', '')
return {
'id': video_id,
'url': final_url,
'title': title,
'thumbnail': self._og_search_thumbnail(webpage),
'description': self._og_search_description(webpage),
'uploader': uploader,
}

yt_dlp/extractor/elevensports.py (deleted)

@@ -1,59 +0,0 @@
from .common import InfoExtractor
from ..utils import (
parse_iso8601,
traverse_obj,
url_or_none,
)
class ElevenSportsIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?elevensports\.com/view/event/(?P<id>\w+)'
_TESTS = [{
'url': 'https://elevensports.com/view/event/clf46yr3kenn80jgrqsjmwefk',
'md5': 'c0958d9ff90e4503a75544358758921d',
'info_dict': {
'id': 'clf46yr3kenn80jgrqsjmwefk',
'title': 'Cleveland SC vs Lionsbridge FC',
'ext': 'mp4',
'description': 'md5:03b5238d6549f4ea1fddadf69b5e0b58',
'upload_date': '20230323',
'timestamp': 1679612400,
'thumbnail': r're:^https?://.*\.(?:jpg|png)',
},
'params': {'skip_download': 'm3u8'}
}, {
'url': 'https://elevensports.com/view/event/clhpyd53b06160jez74qhgkmf',
'md5': 'c0958d9ff90e4503a75544358758921d',
'info_dict': {
'id': 'clhpyd53b06160jez74qhgkmf',
'title': 'AJNLF vs ARRAF',
'ext': 'mp4',
'description': 'md5:c8c5e75c78f37c6d15cd6c475e43a8c1',
'upload_date': '20230521',
'timestamp': 1684684800,
'thumbnail': r're:^https?://.*\.(?:jpg|png)',
},
'params': {'skip_download': 'm3u8'}
}]
def _real_extract(self, url):
video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id)
event_id = self._search_nextjs_data(webpage, video_id)['props']['pageProps']['event']['mclsEventId']
event_data = self._download_json(
f'https://mcls-api.mycujoo.tv/bff/events/v1beta1/{event_id}', video_id,
headers={'Authorization': 'Bearer FBVKACGN37JQC5SFA0OVK8KKSIOP153G'})
formats, subtitles = self._extract_m3u8_formats_and_subtitles(
event_data['streams'][0]['full_url'], video_id, 'mp4', m3u8_id='hls')
return {
'id': video_id,
'formats': formats,
'subtitles': subtitles,
**traverse_obj(event_data, {
'title': ('title', {str}),
'description': ('description', {str}),
'timestamp': ('start_time', {parse_iso8601}),
'thumbnail': ('thumbnail_url', {url_or_none}),
}),
}

yt_dlp/extractor/ellentube.py (deleted)

@@ -1,130 +0,0 @@
from .common import InfoExtractor
from ..utils import (
clean_html,
extract_attributes,
float_or_none,
int_or_none,
try_get,
)
class EllenTubeBaseIE(InfoExtractor):
def _extract_data_config(self, webpage, video_id):
details = self._search_regex(
r'(<[^>]+\bdata-component=(["\'])[Dd]etails.+?></div>)', webpage,
'details')
return self._parse_json(
extract_attributes(details)['data-config'], video_id)
def _extract_video(self, data, video_id):
title = data['title']
formats = []
duration = None
for entry in data.get('media'):
if entry.get('id') == 'm3u8':
formats, subtitles = self._extract_m3u8_formats_and_subtitles(
entry['url'], video_id, 'mp4',
entry_protocol='m3u8_native', m3u8_id='hls')
duration = int_or_none(entry.get('duration'))
break
def get_insight(kind):
return int_or_none(try_get(
data, lambda x: x['insight']['%ss' % kind]))
return {
'extractor_key': EllenTubeIE.ie_key(),
'id': video_id,
'title': title,
'description': data.get('description'),
'duration': duration,
'thumbnail': data.get('thumbnail'),
'timestamp': float_or_none(data.get('publishTime'), scale=1000),
'view_count': get_insight('view'),
'like_count': get_insight('like'),
'formats': formats,
'subtitles': subtitles,
}
class EllenTubeIE(EllenTubeBaseIE):
_VALID_URL = r'''(?x)
(?:
ellentube:|
https://api-prod\.ellentube\.com/ellenapi/api/item/
)
(?P<id>[\da-f]{8}-[\da-f]{4}-[\da-f]{4}-[\da-f]{4}-[\da-f]{12})
'''
_TESTS = [{
'url': 'https://api-prod.ellentube.com/ellenapi/api/item/0822171c-3829-43bf-b99f-d77358ae75e3',
'md5': '2fabc277131bddafdd120e0fc0f974c9',
'info_dict': {
'id': '0822171c-3829-43bf-b99f-d77358ae75e3',
'ext': 'mp4',
'title': 'Ellen Meets Las Vegas Survivors Jesus Campos and Stephen Schuck',
'description': 'md5:76e3355e2242a78ad9e3858e5616923f',
'thumbnail': r're:^https?://.+?',
'duration': 514,
'timestamp': 1508505120,
'upload_date': '20171020',
'view_count': int,
'like_count': int,
}
}, {
'url': 'ellentube:734a3353-f697-4e79-9ca9-bfc3002dc1e0',
'only_matching': True,
}]
def _real_extract(self, url):
video_id = self._match_id(url)
data = self._download_json(
'https://api-prod.ellentube.com/ellenapi/api/item/%s' % video_id,
video_id)
return self._extract_video(data, video_id)
class EllenTubeVideoIE(EllenTubeBaseIE):
_VALID_URL = r'https?://(?:www\.)?ellentube\.com/video/(?P<id>.+?)\.html'
_TEST = {
'url': 'https://www.ellentube.com/video/ellen-meets-las-vegas-survivors-jesus-campos-and-stephen-schuck.html',
'only_matching': True,
}
def _real_extract(self, url):
display_id = self._match_id(url)
webpage = self._download_webpage(url, display_id)
video_id = self._extract_data_config(webpage, display_id)['id']
return self.url_result(
'ellentube:%s' % video_id, ie=EllenTubeIE.ie_key(),
video_id=video_id)
class EllenTubePlaylistIE(EllenTubeBaseIE):
_VALID_URL = r'https?://(?:www\.)?ellentube\.com/(?:episode|studios)/(?P<id>.+?)\.html'
_TESTS = [{
'url': 'https://www.ellentube.com/episode/dax-shepard-jordan-fisher-haim.html',
'info_dict': {
'id': 'dax-shepard-jordan-fisher-haim',
'title': "Dax Shepard, 'DWTS' Team Jordan Fisher & Lindsay Arnold, HAIM",
'description': 'md5:bfc982194dabb3f4e325e43aa6b2e21c',
},
'playlist_count': 6,
}, {
'url': 'https://www.ellentube.com/studios/macey-goes-rving0.html',
'only_matching': True,
}]
def _real_extract(self, url):
display_id = self._match_id(url)
webpage = self._download_webpage(url, display_id)
data = self._extract_data_config(webpage, display_id)['data']
feed = self._download_json(
'https://api-prod.ellentube.com/ellenapi/api/feed/?%s'
% data['filter'], display_id)
entries = [
self._extract_video(elem, elem['id'])
for elem in feed if elem.get('type') == 'VIDEO' and elem.get('id')]
return self.playlist_result(
entries, display_id, data.get('title'),
clean_html(data.get('description')))

yt_dlp/extractor/engadget.py (deleted)

@@ -1,15 +0,0 @@
from .common import InfoExtractor
class EngadgetIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?engadget\.com/video/(?P<id>[^/?#]+)'
_TESTS = [{
# video with vidible ID
'url': 'https://www.engadget.com/video/57a28462134aa15a39f0421a/',
'only_matching': True,
}]
def _real_extract(self, url):
video_id = self._match_id(url)
return self.url_result('aol-video:%s' % video_id)

yt_dlp/extractor/epidemicsound.py (new file)

@@ -0,0 +1,107 @@
from .common import InfoExtractor
from ..utils import (
float_or_none,
int_or_none,
orderedSet,
parse_iso8601,
parse_qs,
parse_resolution,
str_or_none,
traverse_obj,
url_or_none,
)
class EpidemicSoundIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?epidemicsound\.com/track/(?P<id>[0-9a-zA-Z]+)'
_TESTS = [{
'url': 'https://www.epidemicsound.com/track/yFfQVRpSPz/',
'md5': 'd98ff2ddb49e8acab9716541cbc9dfac',
'info_dict': {
'id': '45014',
'display_id': 'yFfQVRpSPz',
'ext': 'mp3',
'title': 'Door Knock Door 1',
'alt_title': 'Door Knock Door 1',
'tags': ['foley', 'door', 'knock', 'glass', 'window', 'glass door knock'],
'categories': ['Misc. Door'],
'duration': 1,
'thumbnail': 'https://cdn.epidemicsound.com/curation-assets/commercial-release-cover-images/default-sfx/3000x3000.jpg',
'timestamp': 1415320353,
'upload_date': '20141107',
},
}, {
'url': 'https://www.epidemicsound.com/track/mj8GTTwsZd/',
'md5': 'c82b745890f9baf18dc2f8d568ee3830',
'info_dict': {
'id': '148700',
'display_id': 'mj8GTTwsZd',
'ext': 'mp3',
'title': 'Noplace',
'tags': ['liquid drum n bass', 'energetic'],
'categories': ['drum and bass'],
'duration': 237,
'timestamp': 1694426482,
'thumbnail': 'https://cdn.epidemicsound.com/curation-assets/commercial-release-cover-images/11138/3000x3000.jpg',
'upload_date': '20230911',
'release_timestamp': 1700535606,
'release_date': '20231121',
},
}]
@staticmethod
def _epidemic_parse_thumbnail(url: str):
if not url_or_none(url):
return None
return {
'url': url,
**(traverse_obj(url, ({parse_qs}, {
'width': ('width', 0, {int_or_none}),
'height': ('height', 0, {int_or_none}),
})) or parse_resolution(url)),
}
@staticmethod
def _epidemic_fmt_or_none(f):
if not f.get('format'):
f['format'] = f.get('format_id')
elif not f.get('format_id'):
f['format_id'] = f['format']
if not f['url'] or not f['format']:
return None
if f.get('format_note'):
f['format_note'] = f'track ID {f["format_note"]}'
if f['format'] != 'full':
f['preference'] = -2
return f
def _real_extract(self, url):
video_id = self._match_id(url)
json_data = self._download_json(f'https://www.epidemicsound.com/json/track/{video_id}', video_id)
thumbnails = traverse_obj(json_data, [('imageUrl', 'cover')])
thumb_base_url = traverse_obj(json_data, ('coverArt', 'baseUrl', {url_or_none}))
if thumb_base_url:
thumbnails.extend(traverse_obj(json_data, (
'coverArt', 'sizes', ..., {thumb_base_url.__add__})))
return traverse_obj(json_data, {
'id': ('id', {str_or_none}),
'display_id': ('publicSlug', {str}),
'title': ('title', {str}),
'alt_title': ('oldTitle', {str}),
'duration': ('length', {float_or_none}),
'timestamp': ('added', {parse_iso8601}),
'release_timestamp': ('releaseDate', {parse_iso8601}),
'categories': ('genres', ..., 'tag', {str}),
'tags': ('metadataTags', ..., {str}),
'age_limit': ('isExplicit', {lambda b: 18 if b else None}),
'thumbnails': ({lambda _: thumbnails}, {orderedSet}, ..., {self._epidemic_parse_thumbnail}),
'formats': ('stems', {dict.items}, ..., {
'format': (0, {str_or_none}),
'format_note': (1, 's3TrackId', {str_or_none}),
'format_id': (1, 'stemType', {str}),
'url': (1, 'lqMp3Url', {url_or_none}),
}, {self._epidemic_fmt_or_none}),
})
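
The thumbnail helper above prefers explicit width/height query parameters and only falls back to parsing a resolution out of the URL itself. The same fallback in isolation, with invented URLs:

from yt_dlp.utils import int_or_none, parse_qs, parse_resolution, traverse_obj

def probe_resolution(url):
    # mirrors _epidemic_parse_thumbnail: query params first, then the path
    return traverse_obj(url, ({parse_qs}, {
        'width': ('width', 0, {int_or_none}),
        'height': ('height', 0, {int_or_none}),
    })) or parse_resolution(url)

print(probe_resolution('https://example.invalid/cover.jpg?width=600&height=400'))  # {'width': 600, 'height': 400}
print(probe_resolution('https://example.invalid/images/3000x3000.jpg'))            # {'width': 3000, 'height': 3000}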

yt_dlp/extractor/eplus.py (modified)

@@ -1,15 +1,20 @@
+import json
+
 from .common import InfoExtractor
 from ..utils import (
     ExtractorError,
     try_call,
     unified_timestamp,
+    urlencode_postdata,
 )


 class EplusIbIE(InfoExtractor):
-    IE_NAME = 'eplus:inbound'
-    IE_DESC = 'e+ (イープラス) overseas'
-    _VALID_URL = r'https?://live\.eplus\.jp/ex/player\?ib=(?P<id>(?:\w|%2B|%2F){86}%3D%3D)'
+    _NETRC_MACHINE = 'eplus'
+    IE_NAME = 'eplus'
+    IE_DESC = 'e+ (イープラス)'
+    _VALID_URL = [r'https?://live\.eplus\.jp/ex/player\?ib=(?P<id>(?:\w|%2B|%2F){86}%3D%3D)',
+                  r'https?://live\.eplus\.jp/(?P<id>sample|\d+)']
     _TESTS = [{
         'url': 'https://live.eplus.jp/ex/player?ib=YEFxb3Vyc2Dombnjg7blkrLlrablnJLjgrnjgq%2Fjg7zjg6vjgqLjgqTjg4njg6vlkIzlpb3kvJpgTGllbGxhIQ%3D%3D',
         'info_dict': {
@@ -29,14 +34,97 @@ class EplusIbIE(InfoExtractor):
             'No video formats found!',
             'Requested format is not available',
         ],
+    }, {
+        'url': 'https://live.eplus.jp/sample',
+        'info_dict': {
+            'id': 'stream1ng20210719-test-005',
+            'title': 'Online streaming test for DRM',
+            'live_status': 'was_live',
+            'release_date': '20210719',
+            'release_timestamp': 1626703200,
+            'description': None,
+        },
+        'params': {
+            'skip_download': True,
+            'ignore_no_formats_error': True,
+        },
+        'expected_warnings': [
+            'Could not find the playlist URL. This event may not be accessible',
+            'No video formats found!',
+            'Requested format is not available',
+            'This video is DRM protected',
+        ],
+    }, {
+        'url': 'https://live.eplus.jp/2053935',
+        'info_dict': {
+            'id': '331320-0001-001',
+            'title': '丘みどり2020配信LIVE Vol.2 ~秋麗~ 【Streaming+(配信チケット)】',
+            'live_status': 'was_live',
+            'release_date': '20200920',
+            'release_timestamp': 1600596000,
+        },
+        'params': {
+            'skip_download': True,
+            'ignore_no_formats_error': True,
+        },
+        'expected_warnings': [
+            'Could not find the playlist URL. This event may not be accessible',
+            'No video formats found!',
+            'Requested format is not available',
+        ],
     }]

+    _USER_AGENT = 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/119.0.0.0 Safari/537.36 Edg/119.0.0.0'
+
+    def _login(self, username, password, urlh):
+        if not self._get_cookies('https://live.eplus.jp/').get('ci_session'):
+            raise ExtractorError('Unable to get ci_session cookie')
+
+        cltft_token = urlh.headers.get('X-CLTFT-Token')
+        if not cltft_token:
+            raise ExtractorError('Unable to get X-CLTFT-Token')
+        self._set_cookie('live.eplus.jp', 'X-CLTFT-Token', cltft_token)
+
+        login_json = self._download_json(
+            'https://live.eplus.jp/member/api/v1/FTAuth/idpw', None,
+            note='Sending pre-login info', errnote='Unable to send pre-login info', headers={
+                'Content-Type': 'application/json; charset=UTF-8',
+                'Referer': urlh.url,
+                'X-Cltft-Token': cltft_token,
+                'Accept': '*/*',
+            }, data=json.dumps({
+                'loginId': username,
+                'loginPassword': password,
+            }).encode())
+        if not login_json.get('isSuccess'):
+            raise ExtractorError('Login failed: Invalid id or password', expected=True)
+
+        self._request_webpage(
+            urlh.url, None, note='Logging in', errnote='Unable to log in',
+            data=urlencode_postdata({
+                'loginId': username,
+                'loginPassword': password,
+                'Token.Default': cltft_token,
+                'op': 'nextPage',
+            }), headers={'Referer': urlh.url})
+
     def _real_extract(self, url):
         video_id = self._match_id(url)

-        webpage = self._download_webpage(url, video_id)
+        webpage, urlh = self._download_webpage_handle(
+            url, video_id, headers={'User-Agent': self._USER_AGENT})
+        if urlh.url.startswith('https://live.eplus.jp/member/auth'):
+            username, password = self._get_login_info()
+            if not username:
+                self.raise_login_required()
+            self._login(username, password, urlh)
+            webpage = self._download_webpage(
+                url, video_id, headers={'User-Agent': self._USER_AGENT})

         data_json = self._search_json(r'<script>\s*var app\s*=', webpage, 'data json', video_id)

+        if data_json.get('drm_mode') == 'ON':
+            self.report_drm(video_id)
+
         delivery_status = data_json.get('delivery_status')
         archive_mode = data_json.get('archive_mode')
         release_timestamp = try_call(lambda: unified_timestamp(data_json['event_datetime']) - 32400)
@@ -64,7 +152,7 @@ class EplusIbIE(InfoExtractor):
         formats = []

         m3u8_playlist_urls = self._search_json(
-            r'var listChannels\s*=', webpage, 'hls URLs', video_id, contains_pattern=r'\[.+\]', default=[])
+            r'var\s+listChannels\s*=', webpage, 'hls URLs', video_id, contains_pattern=r'\[.+\]', default=[])
         if not m3u8_playlist_urls:
             if live_status == 'is_upcoming':
                 self.raise_no_formats(
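
With the new _NETRC_MACHINE, credentials for the login flow above can also come from ~/.netrc rather than the command line (illustrative values):

# ~/.netrc
machine eplus login your-eplus-id password your-password

# then:
yt-dlp --netrc https://live.eplus.jp/2053935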

yt_dlp/extractor/escapist.py (deleted)

@@ -1,108 +0,0 @@
from .common import InfoExtractor
from ..utils import (
determine_ext,
clean_html,
int_or_none,
float_or_none,
)
def _decrypt_config(key, string):
a = ''
i = ''
r = ''
while len(a) < (len(string) / 2):
a += key
a = a[0:int(len(string) / 2)]
t = 0
while t < len(string):
i += chr(int(string[t] + string[t + 1], 16))
t += 2
icko = [s for s in i]
for t, c in enumerate(a):
r += chr(ord(c) ^ ord(icko[t]))
return r
class EscapistIE(InfoExtractor):
_VALID_URL = r'https?://?(?:(?:www|v1)\.)?escapistmagazine\.com/videos/view/[^/]+/(?P<id>[0-9]+)'
_TESTS = [{
'url': 'http://www.escapistmagazine.com/videos/view/the-escapist-presents/6618-Breaking-Down-Baldurs-Gate',
'md5': 'ab3a706c681efca53f0a35f1415cf0d1',
'info_dict': {
'id': '6618',
'ext': 'mp4',
'description': "Baldur's Gate: Original, Modded or Enhanced Edition? I'll break down what you can expect from the new Baldur's Gate: Enhanced Edition.",
'title': "Breaking Down Baldur's Gate",
'thumbnail': r're:^https?://.*\.jpg$',
'duration': 264,
'uploader': 'The Escapist',
}
}, {
'url': 'http://www.escapistmagazine.com/videos/view/zero-punctuation/10044-Evolve-One-vs-Multiplayer',
'md5': '9e8c437b0dbb0387d3bd3255ca77f6bf',
'info_dict': {
'id': '10044',
'ext': 'mp4',
'description': 'This week, Zero Punctuation reviews Evolve.',
'title': 'Evolve - One vs Multiplayer',
'thumbnail': r're:^https?://.*\.jpg$',
'duration': 304,
'uploader': 'The Escapist',
}
}, {
'url': 'http://escapistmagazine.com/videos/view/the-escapist-presents/6618',
'only_matching': True,
}, {
'url': 'https://v1.escapistmagazine.com/videos/view/the-escapist-presents/6618-Breaking-Down-Baldurs-Gate',
'only_matching': True,
}]
def _real_extract(self, url):
video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id)
ims_video = self._parse_json(
self._search_regex(
r'imsVideo\.play\(({.+?})\);', webpage, 'imsVideo'),
video_id)
video_id = ims_video['videoID']
key = ims_video['hash']
config = self._download_webpage(
'http://www.escapistmagazine.com/videos/vidconfig.php',
video_id, 'Downloading video config', headers={
'Referer': url,
}, query={
'videoID': video_id,
'hash': key,
})
data = self._parse_json(_decrypt_config(key, config), video_id)
video_data = data['videoData']
title = clean_html(video_data['title'])
formats = [{
'url': video['src'],
'format_id': '%s-%sp' % (determine_ext(video['src']), video['res']),
'height': int_or_none(video.get('res')),
} for video in data['files']['videos']]
return {
'id': video_id,
'formats': formats,
'title': title,
'thumbnail': self._og_search_thumbnail(webpage) or data.get('poster'),
'description': self._og_search_description(webpage),
'duration': float_or_none(video_data.get('duration'), 1000),
'uploader': video_data.get('publisher'),
'series': video_data.get('show'),
}
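
The _decrypt_config helper above hex-decodes the payload and XORs it against the key repeated to the payload length. A round-trip sketch with a made-up key and payload, assuming _decrypt_config from the file above is in scope:

def _encrypt_config(key, plaintext):
    # inverse of _decrypt_config: XOR against the repeated key, then hex-encode
    keystream = (key * (len(plaintext) // len(key) + 1))[:len(plaintext)]
    return ''.join(f'{ord(p) ^ ord(k):02x}' for p, k in zip(plaintext, keystream))

payload = '{"videoData": {"title": "demo"}}'
assert _decrypt_config('somehash', _encrypt_config('somehash', payload)) == payload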

yt_dlp/extractor/esri.py (deleted)

@@ -1,70 +0,0 @@
import re
from .common import InfoExtractor
from ..compat import compat_urlparse
from ..utils import (
int_or_none,
parse_filesize,
unified_strdate,
)
class EsriVideoIE(InfoExtractor):
_VALID_URL = r'https?://video\.esri\.com/watch/(?P<id>[0-9]+)'
_TEST = {
'url': 'https://video.esri.com/watch/1124/arcgis-online-_dash_-developing-applications',
'md5': 'd4aaf1408b221f1b38227a9bbaeb95bc',
'info_dict': {
'id': '1124',
'ext': 'mp4',
'title': 'ArcGIS Online - Developing Applications',
'description': 'Jeremy Bartley demonstrates how to develop applications with ArcGIS Online.',
'thumbnail': r're:^https?://.*\.jpg$',
'duration': 185,
'upload_date': '20120419',
}
}
def _real_extract(self, url):
video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id)
formats = []
for width, height, content in re.findall(
r'(?s)<li><strong>(\d+)x(\d+):</strong>(.+?)</li>', webpage):
for video_url, ext, filesize in re.findall(
r'<a[^>]+href="([^"]+)">([^<]+)&nbsp;\(([^<]+)\)</a>', content):
formats.append({
'url': compat_urlparse.urljoin(url, video_url),
'ext': ext.lower(),
'format_id': '%s-%s' % (ext.lower(), height),
'width': int(width),
'height': int(height),
'filesize_approx': parse_filesize(filesize),
})
title = self._html_search_meta('title', webpage, 'title')
description = self._html_search_meta(
'description', webpage, 'description', fatal=False)
thumbnail = self._html_search_meta('thumbnail', webpage, 'thumbnail', fatal=False)
if thumbnail:
thumbnail = re.sub(r'_[st]\.jpg$', '_x.jpg', thumbnail)
duration = int_or_none(self._search_regex(
[r'var\s+videoSeconds\s*=\s*(\d+)', r"'duration'\s*:\s*(\d+)"],
webpage, 'duration', fatal=False))
upload_date = unified_strdate(self._html_search_meta(
'last-modified', webpage, 'upload date', fatal=False))
return {
'id': video_id,
'title': title,
'description': description,
'thumbnail': thumbnail,
'duration': duration,
'upload_date': upload_date,
'formats': formats
}

yt_dlp/extractor/expotv.py (deleted)

@@ -1,74 +0,0 @@
from .common import InfoExtractor
from ..utils import (
int_or_none,
unified_strdate,
)
class ExpoTVIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?expotv\.com/videos/[^?#]*/(?P<id>[0-9]+)($|[?#])'
_TEST = {
'url': 'http://www.expotv.com/videos/reviews/3/40/NYX-Butter-lipstick/667916',
'md5': 'fe1d728c3a813ff78f595bc8b7a707a8',
'info_dict': {
'id': '667916',
'ext': 'mp4',
'title': 'NYX Butter Lipstick Little Susie',
'description': 'Goes on like butter, but looks better!',
'thumbnail': r're:^https?://.*\.jpg$',
'uploader': 'Stephanie S.',
'upload_date': '20150520',
'view_count': int,
}
}
def _real_extract(self, url):
video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id)
player_key = self._search_regex(
r'<param name="playerKey" value="([^"]+)"', webpage, 'player key')
config = self._download_json(
'http://client.expotv.com/video/config/%s/%s' % (video_id, player_key),
video_id, 'Downloading video configuration')
formats = []
for fcfg in config['sources']:
media_url = fcfg.get('file')
if not media_url:
continue
if fcfg.get('type') == 'm3u8':
formats.extend(self._extract_m3u8_formats(
media_url, video_id, 'mp4', entry_protocol='m3u8_native', m3u8_id='hls'))
else:
formats.append({
'url': media_url,
'height': int_or_none(fcfg.get('height')),
'format_id': fcfg.get('label'),
'ext': self._search_regex(
r'filename=.*\.([a-z0-9_A-Z]+)&', media_url,
'file extension', default=None) or fcfg.get('type'),
})
title = self._og_search_title(webpage)
description = self._og_search_description(webpage)
thumbnail = config.get('image')
view_count = int_or_none(self._search_regex(
r'<h5>Plays: ([0-9]+)</h5>', webpage, 'view counts'))
uploader = self._search_regex(
r'<div class="reviewer">\s*<img alt="([^"]+)"', webpage, 'uploader',
fatal=False)
upload_date = unified_strdate(self._search_regex(
r'<h5>Reviewed on ([0-9/.]+)</h5>', webpage, 'upload date',
fatal=False), day_first=False)
return {
'id': video_id,
'formats': formats,
'title': title,
'description': description,
'view_count': view_count,
'thumbnail': thumbnail,
'uploader': uploader,
'upload_date': upload_date,
}

yt_dlp/extractor/extremetube.py (deleted)

@@ -1,48 +0,0 @@
from ..utils import str_to_int
from .keezmovies import KeezMoviesIE
class ExtremeTubeIE(KeezMoviesIE): # XXX: Do not subclass from concrete IE
_VALID_URL = r'https?://(?:www\.)?extremetube\.com/(?:[^/]+/)?video/(?P<id>[^/#?&]+)'
_TESTS = [{
'url': 'http://www.extremetube.com/video/music-video-14-british-euro-brit-european-cumshots-swallow-652431',
'md5': '92feaafa4b58e82f261e5419f39c60cb',
'info_dict': {
'id': 'music-video-14-british-euro-brit-european-cumshots-swallow-652431',
'ext': 'mp4',
'title': 'Music Video 14 british euro brit european cumshots swallow',
'uploader': 'anonim',
'view_count': int,
'age_limit': 18,
}
}, {
'url': 'http://www.extremetube.com/gay/video/abcde-1234',
'only_matching': True,
}, {
'url': 'http://www.extremetube.com/video/latina-slut-fucked-by-fat-black-dick',
'only_matching': True,
}, {
'url': 'http://www.extremetube.com/video/652431',
'only_matching': True,
}]
def _real_extract(self, url):
webpage, info = self._extract_info(url)
if not info['title']:
info['title'] = self._search_regex(
r'<h1[^>]+title="([^"]+)"[^>]*>', webpage, 'title')
uploader = self._html_search_regex(
r'Uploaded by:\s*</[^>]+>\s*<a[^>]+>(.+?)</a>',
webpage, 'uploader', fatal=False)
view_count = str_to_int(self._search_regex(
r'Views:\s*</[^>]+>\s*<[^>]+>([\d,\.]+)</',
webpage, 'view count', fatal=False))
info.update({
'uploader': uploader,
'view_count': view_count,
})
return info

yt_dlp/extractor/facebook.py (modified)

@@ -16,6 +16,7 @@ from ..utils import (
     determine_ext,
     error_to_compat_str,
     float_or_none,
+    format_field,
     get_element_by_id,
     get_first,
     int_or_none,
@@ -51,7 +52,7 @@ class FacebookIE(InfoExtractor):
                             )\?(?:.*?)(?:v|video_id|story_fbid)=|
                             [^/]+/videos/(?:[^/]+/)?|
                             [^/]+/posts/|
-                            groups/[^/]+/permalink/|
+                            groups/[^/]+/(?:permalink|posts)/|
                             watchparty/
                         )|
                     facebook:
@@ -231,6 +232,21 @@ class FacebookIE(InfoExtractor):
             'uploader_id': '100013949973717',
         },
         'skip': 'Requires logging in',
+    }, {
+        # data.node.comet_sections.content.story.attachments[].throwbackStyles.attachment_target_renderer.attachment.target.attachments[].styles.attachment.media
+        'url': 'https://www.facebook.com/groups/1645456212344334/posts/3737828833107051/',
+        'info_dict': {
+            'id': '1569199726448814',
+            'ext': 'mp4',
+            'title': 'Pence MUST GO!',
+            'description': 'Vickie Gentry shared a memory.',
+            'timestamp': 1511548260,
+            'upload_date': '20171124',
+            'uploader': 'Vickie Gentry',
+            'uploader_id': 'pfbid0FuZhHCeWDAxWxEbr3yKPFaRstXvRxgsp9uCPG6GjD4J2AitB35NUAuJ4Q75KcjiDl',
+            'thumbnail': r're:^https?://.*',
+            'duration': 148.435,
+        },
     }, {
         'url': 'https://www.facebook.com/video.php?v=10204634152394104',
         'only_matching': True,
@@ -420,6 +436,29 @@ class FacebookIE(InfoExtractor):
             r'data-sjs>({.*?ScheduledServerJS.*?})</script>', webpage)]
         post = traverse_obj(post_data, (
             ..., 'require', ..., ..., ..., '__bbox', 'require', ..., ..., ..., '__bbox', 'result', 'data'), expected_type=dict) or []
+
+        automatic_captions, subtitles = {}, {}
+        subs_data = traverse_obj(post, (..., 'video', ..., 'attachments', ..., lambda k, v: (
+            k == 'media' and str(v['id']) == video_id and v['__typename'] == 'Video')))
+        is_video_broadcast = get_first(subs_data, 'is_video_broadcast', expected_type=bool)
+        captions = get_first(subs_data, 'video_available_captions_locales', 'captions_url')
+        if url_or_none(captions):  # if subs_data only had a 'captions_url'
+            locale = self._html_search_meta(['og:locale', 'twitter:locale'], webpage, 'locale', default='en_US')
+            subtitles[locale] = [{'url': captions}]
+        # or else subs_data had 'video_available_captions_locales', a list of dicts
+        for caption in traverse_obj(captions, (
+            {lambda x: sorted(x, key=lambda c: c['locale'])}, lambda _, v: v['captions_url'])
+        ):
+            lang = caption.get('localized_language') or ''
+            subs = {
+                'url': caption['captions_url'],
+                'name': format_field(caption, 'localized_country', f'{lang} (%s)', default=lang),
+            }
+            if caption.get('localized_creation_method') or is_video_broadcast:
+                automatic_captions.setdefault(caption['locale'], []).append(subs)
+            else:
+                subtitles.setdefault(caption['locale'], []).append(subs)
+
         media = traverse_obj(post, (..., 'attachments', ..., lambda k, v: (
             k == 'media' and str(v['id']) == video_id and v['__typename'] == 'Video')), expected_type=dict)
         title = get_first(media, ('title', 'text'))
@@ -463,6 +502,8 @@ class FacebookIE(InfoExtractor):
                     webpage, 'view count', default=None)),
                 'concurrent_view_count': get_first(post, (
                     ('video', (..., ..., 'attachments', ..., 'media')), 'liveViewerCount', {int_or_none})),
+                'automatic_captions': automatic_captions,
+                'subtitles': subtitles,
             }

             info_json_ld = self._search_json_ld(webpage, video_id, default={})
@@ -586,9 +627,11 @@ class FacebookIE(InfoExtractor):
             nodes = variadic(traverse_obj(data, 'nodes', 'node') or [])
             attachments = traverse_obj(nodes, (
                 ..., 'comet_sections', 'content', 'story', (None, 'attached_story'), 'attachments',
-                ..., ('styles', 'style_type_renderer'), 'attachment'), expected_type=dict) or []
+                ..., ('styles', 'style_type_renderer', ('throwbackStyles', 'attachment_target_renderer')),
+                'attachment', {dict}))
             for attachment in attachments:
-                ns = try_get(attachment, lambda x: x['all_subattachments']['nodes'], list) or []
+                ns = traverse_obj(attachment, ('all_subattachments', 'nodes', ..., {dict}),
+                                  ('target', 'attachments', ..., 'styles', 'attachment', {dict}))
                 for n in ns:
                     parse_attachment(n)
                 parse_attachment(attachment)
@@ -611,7 +654,7 @@ class FacebookIE(InfoExtractor):
         if len(entries) > 1:
             return self.playlist_result(entries, video_id)

-        video_info = entries[0]
+        video_info = entries[0] if entries else {'id': video_id}
         webpage_info = extract_metadata(webpage)
         # honor precise duration in video info
         if video_info.get('duration'):
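
The caption handling added above splits tracks into two buckets: anything flagged with a localized_creation_method (or belonging to a live broadcast) is treated as auto-generated. The rule in isolation, with fabricated API rows:

captions = [
    {'locale': 'en_US', 'captions_url': 'https://example.invalid/en.srt',
     'localized_creation_method': 'Auto-generated'},
    {'locale': 'de_DE', 'captions_url': 'https://example.invalid/de.srt'},
]
automatic_captions, subtitles = {}, {}
for c in captions:
    # creation method present -> machine-generated; otherwise a manual subtitle track
    bucket = automatic_captions if c.get('localized_creation_method') else subtitles
    bucket.setdefault(c['locale'], []).append({'url': c['captions_url']})
# en_US ends up in automatic_captions, de_DE in subtitles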

yt_dlp/extractor/fc2.py (modified)

@@ -2,11 +2,9 @@ import re
 from .common import InfoExtractor
 from ..compat import compat_parse_qs
-from ..dependencies import websockets
 from ..networking import Request
 from ..utils import (
     ExtractorError,
-    WebSocketsWrapper,
     js_to_json,
     traverse_obj,
     update_url_query,
@@ -167,8 +165,6 @@ class FC2LiveIE(InfoExtractor):
     }]

     def _real_extract(self, url):
-        if not websockets:
-            raise ExtractorError('websockets library is not available. Please install it.', expected=True)
         video_id = self._match_id(url)
         webpage = self._download_webpage('https://live.fc2.com/%s/' % video_id, video_id)
@@ -199,13 +195,9 @@ class FC2LiveIE(InfoExtractor):
         ws_url = update_url_query(control_server['url'], {'control_token': control_server['control_token']})
         playlist_data = None

-        self.to_screen('%s: Fetching HLS playlist info via WebSocket' % video_id)
-        ws = WebSocketsWrapper(ws_url, {
-            'Cookie': str(self._get_cookies('https://live.fc2.com/'))[12:],
-            'Origin': 'https://live.fc2.com',
-            'Accept': '*/*',
-            'User-Agent': self.get_param('http_headers')['User-Agent'],
-        })
+        ws = self._request_webpage(Request(ws_url, headers={
+            'Origin': 'https://live.fc2.com',
+        }), video_id, note='Fetching HLS playlist info via WebSocket')

         self.write_debug('Sending HLS server request')

yt_dlp/extractor/floatplane.py (new file)

@@ -0,0 +1,268 @@
import functools
from .common import InfoExtractor
from ..utils import (
ExtractorError,
OnDemandPagedList,
clean_html,
determine_ext,
format_field,
int_or_none,
join_nonempty,
parse_codecs,
parse_iso8601,
urljoin,
)
from ..utils.traversal import traverse_obj
class FloatplaneIE(InfoExtractor):
_VALID_URL = r'https?://(?:(?:www|beta)\.)?floatplane\.com/post/(?P<id>\w+)'
_TESTS = [{
'url': 'https://www.floatplane.com/post/2Yf3UedF7C',
'info_dict': {
'id': 'yuleLogLTT',
'ext': 'mp4',
'display_id': '2Yf3UedF7C',
'title': '8K Yule Log Fireplace with Crackling Fire Sounds - 10 Hours',
'description': 'md5:adf2970e0de1c5e3df447818bb0309f6',
'thumbnail': r're:^https?://.*\.jpe?g$',
'duration': 36035,
'comment_count': int,
'like_count': int,
'dislike_count': int,
'release_date': '20191206',
'release_timestamp': 1575657000,
'uploader': 'LinusTechTips',
'uploader_id': '59f94c0bdd241b70349eb72b',
'uploader_url': 'https://www.floatplane.com/channel/linustechtips/home',
'channel': 'Linus Tech Tips',
'channel_id': '63fe42c309e691e4e36de93d',
'channel_url': 'https://www.floatplane.com/channel/linustechtips/home/main',
'availability': 'subscriber_only',
},
'params': {'skip_download': 'm3u8'},
}, {
'url': 'https://www.floatplane.com/post/j2jqG3JmgJ',
'info_dict': {
'id': 'j2jqG3JmgJ',
'title': 'TJM: Does Anyone Care About Avatar: The Way of Water?',
'description': 'md5:00bf17dc5733e4031e99b7fd6489f274',
'thumbnail': r're:^https?://.*\.jpe?g$',
'comment_count': int,
'like_count': int,
'dislike_count': int,
'release_timestamp': 1671915900,
'release_date': '20221224',
'uploader': 'LinusTechTips',
'uploader_id': '59f94c0bdd241b70349eb72b',
'uploader_url': 'https://www.floatplane.com/channel/linustechtips/home',
'channel': "They're Just Movies",
'channel_id': '64135f82fc76ab7f9fbdc876',
'channel_url': 'https://www.floatplane.com/channel/linustechtips/home/tajm',
'availability': 'subscriber_only',
},
'playlist_count': 2,
}, {
'url': 'https://www.floatplane.com/post/3tK2tInhoN',
'info_dict': {
'id': '3tK2tInhoN',
'title': 'Extras - How Linus Communicates with Editors (Compensator 4)',
'description': 'md5:83cd40aae1ce124df33769600c80ca5b',
'thumbnail': r're:^https?://.*\.jpe?g$',
'comment_count': int,
'like_count': int,
'dislike_count': int,
'release_timestamp': 1700529120,
'release_date': '20231121',
'uploader': 'LinusTechTips',
'uploader_id': '59f94c0bdd241b70349eb72b',
'uploader_url': 'https://www.floatplane.com/channel/linustechtips/home',
'channel': 'FP Exclusives',
'channel_id': '6413623f5b12cca228a28e78',
'channel_url': 'https://www.floatplane.com/channel/linustechtips/home/fpexclusive',
'availability': 'subscriber_only',
},
'playlist_count': 2,
}, {
'url': 'https://beta.floatplane.com/post/d870PEFXS1',
'info_dict': {
'id': 'bg9SuYKEww',
'ext': 'mp4',
'display_id': 'd870PEFXS1',
'title': 'LCS Drama, TLOU 2 Remaster, Destiny 2 Player Count Drops, + More!',
'description': 'md5:80d612dcabf41b17487afcbe303ec57d',
'thumbnail': r're:^https?://.*\.jpe?g$',
'release_timestamp': 1700622000,
'release_date': '20231122',
'duration': 513,
'like_count': int,
'dislike_count': int,
'comment_count': int,
'uploader': 'LinusTechTips',
'uploader_id': '59f94c0bdd241b70349eb72b',
'uploader_url': 'https://www.floatplane.com/channel/linustechtips/home',
'channel': 'GameLinked',
'channel_id': '649dbade3540dbc3945eeda7',
'channel_url': 'https://www.floatplane.com/channel/linustechtips/home/gamelinked',
'availability': 'subscriber_only',
},
'params': {'skip_download': 'm3u8'},
}]
def _real_initialize(self):
if not self._get_cookies('https://www.floatplane.com').get('sails.sid'):
self.raise_login_required()
def _real_extract(self, url):
post_id = self._match_id(url)
post_data = self._download_json(
'https://www.floatplane.com/api/v3/content/post', post_id, query={'id': post_id},
note='Downloading post data', errnote='Unable to download post data')
if not any(traverse_obj(post_data, ('metadata', ('hasVideo', 'hasAudio')))):
raise ExtractorError('Post does not contain a video or audio track', expected=True)
items = []
for media in traverse_obj(post_data, (('videoAttachments', 'audioAttachments'), ...)):
media_id = media['id']
media_typ = media.get('type') or 'video'
metadata = self._download_json(
f'https://www.floatplane.com/api/v3/content/{media_typ}', media_id, query={'id': media_id},
note=f'Downloading {media_typ} metadata')
stream = self._download_json(
'https://www.floatplane.com/api/v2/cdn/delivery', media_id, query={
'type': 'vod' if media_typ == 'video' else 'aod',
'guid': metadata['guid']
}, note=f'Downloading {media_typ} stream data')
path_template = traverse_obj(stream, ('resource', 'uri', {str}))
def format_path(params):
path = path_template
for i, val in (params or {}).items():
path = path.replace(f'{{qualityLevelParams.{i}}}', val)
return path
formats = []
for quality in traverse_obj(stream, ('resource', 'data', 'qualityLevels', ...)):
url = urljoin(stream['cdn'], format_path(traverse_obj(
stream, ('resource', 'data', 'qualityLevelParams', quality['name']))))
formats.append({
**traverse_obj(quality, {
'format_id': 'name',
'format_note': 'label',
'width': ('width', {int}),
'height': ('height', {int}),
}),
**parse_codecs(quality.get('codecs')),
'url': url,
'ext': determine_ext(url.partition('/chunk.m3u8')[0], 'mp4'),
})
items.append({
'id': media_id,
**traverse_obj(metadata, {
'title': 'title',
'duration': ('duration', {int_or_none}),
'thumbnail': ('thumbnail', 'path'),
}),
'formats': formats,
})
uploader_url = format_field(
post_data, [('creator', 'urlname')], 'https://www.floatplane.com/channel/%s/home') or None
channel_url = urljoin(f'{uploader_url}/', traverse_obj(post_data, ('channel', 'urlname')))
post_info = {
'id': post_id,
'display_id': post_id,
**traverse_obj(post_data, {
'title': 'title',
'description': ('text', {clean_html}),
'uploader': ('creator', 'title'),
'uploader_id': ('creator', 'id'),
'channel': ('channel', 'title'),
'channel_id': ('channel', 'id'),
'like_count': ('likes', {int_or_none}),
'dislike_count': ('dislikes', {int_or_none}),
'comment_count': ('comments', {int_or_none}),
'release_timestamp': ('releaseDate', {parse_iso8601}),
'thumbnail': ('thumbnail', 'path'),
}),
'uploader_url': uploader_url,
'channel_url': channel_url,
'availability': self._availability(needs_subscription=True),
}
if len(items) > 1:
return self.playlist_result(items, **post_info)
post_info.update(items[0])
return post_info
class FloatplaneChannelIE(InfoExtractor):
_VALID_URL = r'https?://(?:(?:www|beta)\.)?floatplane\.com/channel/(?P<id>[\w-]+)/home(?:/(?P<channel>[\w-]+))?'
_PAGE_SIZE = 20
_TESTS = [{
'url': 'https://www.floatplane.com/channel/linustechtips/home/ltxexpo',
'info_dict': {
'id': 'linustechtips/ltxexpo',
'title': 'LTX Expo',
'description': 'md5:9819002f9ebe7fd7c75a3a1d38a59149',
},
'playlist_mincount': 51,
}, {
'url': 'https://www.floatplane.com/channel/ShankMods/home',
'info_dict': {
'id': 'ShankMods',
'title': 'Shank Mods',
'description': 'md5:6dff1bb07cad8e5448e04daad9be1b30',
},
'playlist_mincount': 14,
}, {
'url': 'https://beta.floatplane.com/channel/bitwit_ultra/home',
'info_dict': {
'id': 'bitwit_ultra',
'title': 'Bitwit Ultra',
'description': 'md5:1452f280bb45962976d4789200f676dd',
},
'playlist_mincount': 200,
}]
def _fetch_page(self, display_id, creator_id, channel_id, page):
query = {
'id': creator_id,
'limit': self._PAGE_SIZE,
'fetchAfter': page * self._PAGE_SIZE,
}
if channel_id:
query['channel'] = channel_id
page_data = self._download_json(
'https://www.floatplane.com/api/v3/content/creator', display_id,
query=query, note=f'Downloading page {page + 1}')
for post in page_data or []:
yield self.url_result(
f'https://www.floatplane.com/post/{post["id"]}',
FloatplaneIE, id=post['id'], title=post.get('title'),
release_timestamp=parse_iso8601(post.get('releaseDate')))
def _real_extract(self, url):
creator, channel = self._match_valid_url(url).group('id', 'channel')
display_id = join_nonempty(creator, channel, delim='/')
creator_data = self._download_json(
'https://www.floatplane.com/api/v3/creator/named',
display_id, query={'creatorURL[0]': creator})[0]
channel_data = traverse_obj(
creator_data, ('channels', lambda _, v: v['urlname'] == channel), get_all=False) or {}
return self.playlist_result(OnDemandPagedList(functools.partial(
self._fetch_page, display_id, creator_data['id'], channel_data.get('id')), self._PAGE_SIZE),
display_id, title=channel_data.get('title') or creator_data.get('title'),
description=channel_data.get('about') or creator_data.get('about'))
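
FloatplaneIE builds each format URL by substituting named qualityLevelParams placeholders into the CDN path template returned by the delivery API. The substitution step in isolation (the template and params here are invented):

path_template = '/Videos/demo/{qualityLevelParams.2}.mp4/chunk.m3u8?token={qualityLevelParams.4}'

def format_path(params):
    # replace each {qualityLevelParams.<key>} placeholder with its value
    path = path_template
    for i, val in (params or {}).items():
        path = path.replace(f'{{qualityLevelParams.{i}}}', val)
    return path

print(format_path({'2': '1080', '4': 'abc123'}))
# /Videos/demo/1080.mp4/chunk.m3u8?token=abc123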

yt_dlp/extractor/fourzerostudio.py (deleted)

@@ -1,106 +0,0 @@
from .common import InfoExtractor
from ..utils import traverse_obj, unified_timestamp
class FourZeroStudioArchiveIE(InfoExtractor):
_VALID_URL = r'https?://0000\.studio/(?P<uploader_id>[^/]+)/broadcasts/(?P<id>[^/]+)/archive'
IE_NAME = '0000studio:archive'
_TESTS = [{
'url': 'https://0000.studio/mumeijiten/broadcasts/1290f433-fce0-4909-a24a-5f7df09665dc/archive',
'info_dict': {
'id': '1290f433-fce0-4909-a24a-5f7df09665dc',
'title': 'noteで『canape』様へのファンレターを執筆します。数秘術その2',
'timestamp': 1653802534,
'release_timestamp': 1653796604,
'thumbnails': 'count:1',
'comments': 'count:7',
'uploader': '『中崎雄心』の執務室。',
'uploader_id': 'mumeijiten',
}
}]
def _real_extract(self, url):
video_id, uploader_id = self._match_valid_url(url).group('id', 'uploader_id')
webpage = self._download_webpage(url, video_id)
nuxt_data = self._search_nuxt_data(webpage, video_id, traverse=None)
pcb = traverse_obj(nuxt_data, ('ssrRefs', lambda _, v: v['__typename'] == 'PublicCreatorBroadcast'), get_all=False)
uploader_internal_id = traverse_obj(nuxt_data, (
'ssrRefs', lambda _, v: v['__typename'] == 'PublicUser', 'id'), get_all=False)
formats, subs = self._extract_m3u8_formats_and_subtitles(pcb['archiveUrl'], video_id, ext='mp4')
return {
'id': video_id,
'title': pcb.get('title'),
'age_limit': 18 if pcb.get('isAdult') else None,
'timestamp': unified_timestamp(pcb.get('finishTime')),
'release_timestamp': unified_timestamp(pcb.get('createdAt')),
'thumbnails': [{
'url': pcb['thumbnailUrl'],
'ext': 'png',
}] if pcb.get('thumbnailUrl') else None,
'formats': formats,
'subtitles': subs,
'comments': [{
'author': c.get('username'),
'author_id': c.get('postedUserId'),
'author_thumbnail': c.get('userThumbnailUrl'),
'id': c.get('id'),
'text': c.get('body'),
'timestamp': unified_timestamp(c.get('createdAt')),
'like_count': c.get('likeCount'),
'is_favorited': c.get('isLikedByOwner'),
'author_is_uploader': c.get('postedUserId') == uploader_internal_id,
} for c in traverse_obj(nuxt_data, (
'ssrRefs', ..., lambda _, v: v['__typename'] == 'PublicCreatorBroadcastComment')) or []],
'uploader_id': uploader_id,
'uploader': traverse_obj(nuxt_data, (
'ssrRefs', lambda _, v: v['__typename'] == 'PublicUser', 'username'), get_all=False),
}
class FourZeroStudioClipIE(InfoExtractor):
_VALID_URL = r'https?://0000\.studio/(?P<uploader_id>[^/]+)/archive-clip/(?P<id>[^/]+)'
IE_NAME = '0000studio:clip'
_TESTS = [{
'url': 'https://0000.studio/soeji/archive-clip/e46b0278-24cd-40a8-92e1-b8fc2b21f34f',
'info_dict': {
'id': 'e46b0278-24cd-40a8-92e1-b8fc2b21f34f',
'title': 'わたベーさんからイラスト差し入れいただきました。ありがとうございました!',
'timestamp': 1652109105,
'like_count': 1,
'uploader': 'ソエジマケイタ',
'uploader_id': 'soeji',
}
}]
def _real_extract(self, url):
video_id, uploader_id = self._match_valid_url(url).group('id', 'uploader_id')
webpage = self._download_webpage(url, video_id)
nuxt_data = self._search_nuxt_data(webpage, video_id, traverse=None)
clip_info = traverse_obj(nuxt_data, ('ssrRefs', lambda _, v: v['__typename'] == 'PublicCreatorArchivedClip'), get_all=False)
info = next((
m for m in self._parse_html5_media_entries(url, webpage, video_id)
if 'mp4' in traverse_obj(m, ('formats', ..., 'ext'))
), None)
if not info:
self.report_warning('Failed to find a desired media element. Falling back to using NUXT data.')
info = {
'formats': [{
'ext': 'mp4',
'url': url,
} for url in clip_info.get('mediaFiles') or [] if url],
}
return {
**info,
'id': video_id,
'title': clip_info.get('clipComment'),
'timestamp': unified_timestamp(clip_info.get('createdAt')),
'like_count': clip_info.get('likeCount'),
'uploader_id': uploader_id,
'uploader': traverse_obj(nuxt_data, (
'ssrRefs', lambda _, v: v['__typename'] == 'PublicUser', 'username'), get_all=False),
}

yt_dlp/extractor/foxgay.py (deleted)

@@ -1,58 +0,0 @@
import itertools
from .common import InfoExtractor
from ..utils import (
get_element_by_id,
int_or_none,
remove_end,
)
class FoxgayIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?foxgay\.com/videos/(?:\S+-)?(?P<id>\d+)\.shtml'
_TEST = {
'url': 'http://foxgay.com/videos/fuck-turkish-style-2582.shtml',
'md5': '344558ccfea74d33b7adbce22e577f54',
'info_dict': {
'id': '2582',
'ext': 'mp4',
'title': 'Fuck Turkish-style',
'description': 'md5:6ae2d9486921891efe89231ace13ffdf',
'age_limit': 18,
'thumbnail': r're:https?://.*\.jpg$',
},
}
def _real_extract(self, url):
video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id)
title = remove_end(self._html_extract_title(webpage), ' - Foxgay.com')
description = get_element_by_id('inf_tit', webpage)
# The default user-agent with foxgay cookies leads to pages without videos
self.cookiejar.clear('.foxgay.com')
# Find the URL for the iFrame which contains the actual video.
iframe_url = self._html_search_regex(
r'<iframe[^>]+src=([\'"])(?P<url>[^\'"]+)\1', webpage,
'video frame', group='url')
iframe = self._download_webpage(
iframe_url, video_id, headers={'User-Agent': 'curl/7.50.1'},
note='Downloading video frame')
video_data = self._parse_json(self._search_regex(
r'video_data\s*=\s*([^;]+);', iframe, 'video data'), video_id)
formats = [{
'url': source,
'height': int_or_none(resolution),
} for source, resolution in zip(
video_data['sources'], video_data.get('resolutions', itertools.repeat(None)))]
return {
'id': video_id,
'title': title,
'formats': formats,
'description': description,
'thumbnail': video_data.get('act_vid', {}).get('thumb'),
'age_limit': 18,
}

yt_dlp/extractor/francetv.py (modified)

@@ -1,12 +1,14 @@
 from .common import InfoExtractor
+from .dailymotion import DailymotionIE
 from ..utils import (
-    determine_ext,
     ExtractorError,
+    determine_ext,
     format_field,
+    int_or_none,
+    join_nonempty,
     parse_iso8601,
     parse_qs,
 )
-from .dailymotion import DailymotionIE


 class FranceTVBaseInfoExtractor(InfoExtractor):
@@ -82,6 +84,8 @@ class FranceTVIE(InfoExtractor):
         videos = []
         title = None
         subtitle = None
+        episode_number = None
+        season_number = None
         image = None
         duration = None
         timestamp = None
@@ -112,7 +116,9 @@ class FranceTVIE(InfoExtractor):
             if meta:
                 if title is None:
                     title = meta.get('title')
-                # XXX: what is meta['pre_title']?
+                # meta['pre_title'] contains season and episode number for series in format "S<ID> E<ID>"
+                season_number, episode_number = self._search_regex(
+                    r'S(\d+)\s*E(\d+)', meta.get('pre_title'), 'episode info', group=(1, 2), default=(None, None))
                 if subtitle is None:
                     subtitle = meta.get('additional_title')
                 if image is None:
@@ -191,19 +197,19 @@ class FranceTVIE(InfoExtractor):
                 } for sheet in spritesheets]
             })

-        if subtitle:
-            title += ' - %s' % subtitle
-        title = title.strip()
-
         return {
             'id': video_id,
-            'title': title,
+            'title': join_nonempty(title, subtitle, delim=' - ').strip(),
             'thumbnail': image,
             'duration': duration,
             'timestamp': timestamp,
             'is_live': is_live,
             'formats': formats,
             'subtitles': subtitles,
+            'episode': subtitle if episode_number else None,
+            'series': title if episode_number else None,
+            'episode_number': int_or_none(episode_number),
+            'season_number': int_or_none(season_number),
         }

     def _real_extract(self, url):
@@ -230,14 +236,31 @@ class FranceTVSiteIE(FranceTVBaseInfoExtractor):
             'id': 'ec217ecc-0733-48cf-ac06-af1347b849d1',
             'ext': 'mp4',
             'title': '13h15, le dimanche... - Les mystères de Jésus',
+            'description': 'md5:75efe8d4c0a8205e5904498ffe1e1a42',
             'timestamp': 1502623500,
+            'duration': 2580,
+            'thumbnail': r're:^https?://.*\.jpg$',
             'upload_date': '20170813',
         },
         'params': {
             'skip_download': True,
         },
         'add_ie': [FranceTVIE.ie_key()],
+    }, {
+        'url': 'https://www.france.tv/enfants/six-huit-ans/foot2rue/saison-1/3066387-duel-au-vieux-port.html',
+        'info_dict': {
+            'id': 'a9050959-eedd-4b4a-9b0d-de6eeaa73e44',
+            'ext': 'mp4',
+            'title': 'Foot2Rue - Duel au vieux port',
+            'episode': 'Duel au vieux port',
+            'series': 'Foot2Rue',
+            'episode_number': 1,
+            'season_number': 1,
+            'timestamp': 1642761360,
+            'upload_date': '20220121',
+            'season': 'Season 1',
+            'thumbnail': r're:^https?://.*\.jpg$',
+            'duration': 1441,
+        },
     }, {
         # france3
         'url': 'https://www.france.tv/france-3/des-chiffres-et-des-lettres/139063-emission-du-mardi-9-mai-2017.html',
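
The new pre_title handling above recovers season and episode numbers from strings like "S1 E1". The same regex in isolation:

import re

m = re.search(r'S(\d+)\s*E(\d+)', 'S2 E12')  # sample pre_title value
season_number, episode_number = m.groups() if m else (None, None)
# season_number == '2', episode_number == '12'; both later pass through int_or_none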

yt_dlp/extractor/fusion.py (deleted)

@@ -1,81 +0,0 @@
from .common import InfoExtractor
from ..utils import (
determine_ext,
int_or_none,
mimetype2ext,
parse_iso8601,
)
class FusionIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?fusion\.(?:net|tv)/(?:video/|show/.+?\bvideo=)(?P<id>\d+)'
_TESTS = [{
'url': 'http://fusion.tv/video/201781/u-s-and-panamanian-forces-work-together-to-stop-a-vessel-smuggling-drugs/',
'info_dict': {
'id': '3145868',
'ext': 'mp4',
'title': 'U.S. and Panamanian forces work together to stop a vessel smuggling drugs',
'description': 'md5:0cc84a9943c064c0f46b128b41b1b0d7',
'duration': 140.0,
'timestamp': 1442589635,
'uploader': 'UNIVISON',
'upload_date': '20150918',
},
'params': {
'skip_download': True,
},
'add_ie': ['Anvato'],
}, {
'url': 'http://fusion.tv/video/201781',
'only_matching': True,
}, {
'url': 'https://fusion.tv/show/food-exposed-with-nelufar-hedayat/?ancla=full-episodes&video=588644',
'only_matching': True,
}]
def _real_extract(self, url):
video_id = self._match_id(url)
video = self._download_json(
'https://platform.fusion.net/wp-json/fusiondotnet/v1/video/' + video_id, video_id)
info = {
'id': video_id,
'title': video['title'],
'description': video.get('excerpt'),
'timestamp': parse_iso8601(video.get('published')),
'series': video.get('show'),
}
formats = []
src = video.get('src') or {}
for f_id, f in src.items():
for q_id, q in f.items():
q_url = q.get('url')
if not q_url:
continue
ext = determine_ext(q_url, mimetype2ext(q.get('type')))
if ext == 'smil':
formats.extend(self._extract_smil_formats(q_url, video_id, fatal=False))
elif f_id == 'm3u8-variant' or (ext == 'm3u8' and q_id == 'Variant'):
formats.extend(self._extract_m3u8_formats(
q_url, video_id, 'mp4', 'm3u8_native', m3u8_id='hls', fatal=False))
else:
formats.append({
'format_id': '-'.join([f_id, q_id]),
'url': q_url,
'width': int_or_none(q.get('width')),
'height': int_or_none(q.get('height')),
'tbr': int_or_none(self._search_regex(r'_(\d+)\.m(?:p4|3u8)', q_url, 'bitrate')),
'ext': 'mp4' if ext == 'm3u8' else ext,
'protocol': 'm3u8_native' if ext == 'm3u8' else 'https',
})
if formats:
info['formats'] = formats
else:
info.update({
'_type': 'url',
'url': 'anvato:uni:' + video['video_ids']['anvato'],
'ie_key': 'Anvato',
})
return info

yt_dlp/extractor/generic.py (modified)

@@ -35,8 +35,8 @@ from ..utils import (
unified_timestamp, unified_timestamp,
unsmuggle_url, unsmuggle_url,
update_url_query, update_url_query,
urlhandle_detect_ext,
url_or_none, url_or_none,
urlhandle_detect_ext,
urljoin, urljoin,
variadic, variadic,
xpath_attr, xpath_attr,
@@ -374,46 +374,6 @@ class GenericIE(InfoExtractor):
}, },
'skip': 'There is a limit of 200 free downloads / month for the test song', 'skip': 'There is a limit of 200 free downloads / month for the test song',
}, },
# ooyala video
{
'url': 'http://www.rollingstone.com/music/videos/norwegian-dj-cashmere-cat-goes-spartan-on-with-me-premiere-20131219',
'md5': '166dd577b433b4d4ebfee10b0824d8ff',
'info_dict': {
'id': 'BwY2RxaTrTkslxOfcan0UCf0YqyvWysJ',
'ext': 'mp4',
'title': '2cc213299525360.mov', # that's what we get
'duration': 238.231,
},
'add_ie': ['Ooyala'],
},
{
# ooyala video embedded with http://player.ooyala.com/iframe.js
'url': 'http://www.macrumors.com/2015/07/24/steve-jobs-the-man-in-the-machine-first-trailer/',
'info_dict': {
'id': 'p0MGJndjoG5SOKqO_hZJuZFPB-Tr5VgB',
'ext': 'mp4',
'title': '"Steve Jobs: Man in the Machine" trailer',
'description': 'The first trailer for the Alex Gibney documentary "Steve Jobs: Man in the Machine."',
'duration': 135.427,
},
'params': {
'skip_download': True,
},
'skip': 'movie expired',
},
# ooyala video embedded with http://player.ooyala.com/static/v4/production/latest/core.min.js
{
'url': 'http://wnep.com/2017/07/22/steampunk-fest-comes-to-honesdale/',
'info_dict': {
'id': 'lwYWYxYzE6V5uJMjNGyKtwwiw9ZJD7t2',
'ext': 'mp4',
'title': 'Steampunk Fest Comes to Honesdale',
'duration': 43.276,
},
'params': {
'skip_download': True,
}
},
# embed.ly video # embed.ly video
{ {
'url': 'http://www.tested.com/science/weird/460206-tested-grinding-coffee-2000-frames-second/', 'url': 'http://www.tested.com/science/weird/460206-tested-grinding-coffee-2000-frames-second/',
@@ -506,7 +466,8 @@ class GenericIE(InfoExtractor):
             'title': 'Ужастики, русский трейлер (2015)',
             'thumbnail': r're:^https?://.*\.jpg$',
             'duration': 153,
-        }
+        },
+        'skip': 'Site dead',
     },
     # XHamster embed
     {
@@ -778,14 +739,16 @@ class GenericIE(InfoExtractor):
         'playlist_mincount': 1,
         'add_ie': ['Youtube'],
     },
-    # Cinchcast embed
+    # Libsyn embed
     {
         'url': 'http://undergroundwellness.com/podcasts/306-5-steps-to-permanent-gut-healing/',
         'info_dict': {
-            'id': '7141703',
+            'id': '3793998',
             'ext': 'mp3',
             'upload_date': '20141126',
-            'title': 'Jack Tips: 5 Steps to Permanent Gut Healing',
+            'title': 'Underground Wellness Radio - Jack Tips: 5 Steps to Permanent Gut Healing',
+            'thumbnail': 'https://assets.libsyn.com/secure/item/3793998/?height=90&width=90',
+            'duration': 3989.0,
         }
     },
     # Cinerama player
@@ -1567,16 +1530,6 @@ class GenericIE(InfoExtractor):
             'title': 'Стас Намин: «Мы нарушили девственность Кремля»',
         },
     },
-    {
-        # vzaar embed
-        'url': 'http://help.vzaar.com/article/165-embedding-video',
-        'md5': '7e3919d9d2620b89e3e00bec7fe8c9d4',
-        'info_dict': {
-            'id': '8707641',
-            'ext': 'mp4',
-            'title': 'Building A Business Online: Principal Chairs Q & A',
-        },
-    },
     {
         # multiple HTML5 videos on one page
         'url': 'https://www.paragon-software.com/home/rk-free/keyscenarios.html',


@@ -1,145 +0,0 @@
-from .common import InfoExtractor
-from ..utils import (
-    int_or_none,
-    float_or_none,
-    qualities,
-    ExtractorError,
-)
-
-
-class GfycatIE(InfoExtractor):
-    _VALID_URL = r'https?://(?:(?:www|giant|thumbs)\.)?gfycat\.com/(?i:ru/|ifr/|gifs/detail/)?(?P<id>[^-/?#\."\']+)'
-    _EMBED_REGEX = [rf'<(?:iframe|source)[^>]+\bsrc=["\'](?P<url>{_VALID_URL})']
-
-    _TESTS = [{
-        'url': 'http://gfycat.com/DeadlyDecisiveGermanpinscher',
-        'info_dict': {
-            'id': 'DeadlyDecisiveGermanpinscher',
-            'ext': 'mp4',
-            'title': 'Ghost in the Shell',
-            'timestamp': 1410656006,
-            'upload_date': '20140914',
-            'uploader': 'anonymous',
-            'duration': 10.4,
-            'view_count': int,
-            'like_count': int,
-            'categories': list,
-            'age_limit': 0,
-            'uploader_id': 'anonymous',
-            'description': '',
-        }
-    }, {
-        'url': 'http://gfycat.com/ifr/JauntyTimelyAmazontreeboa',
-        'info_dict': {
-            'id': 'JauntyTimelyAmazontreeboa',
-            'ext': 'mp4',
-            'title': 'JauntyTimelyAmazontreeboa',
-            'timestamp': 1411720126,
-            'upload_date': '20140926',
-            'uploader': 'anonymous',
-            'duration': 3.52,
-            'view_count': int,
-            'like_count': int,
-            'categories': list,
-            'age_limit': 0,
-            'uploader_id': 'anonymous',
-            'description': '',
-        }
-    }, {
-        'url': 'https://gfycat.com/alienatedsolidgreathornedowl',
-        'info_dict': {
-            'id': 'alienatedsolidgreathornedowl',
-            'ext': 'mp4',
-            'upload_date': '20211226',
-            'uploader_id': 'reactions',
-            'timestamp': 1640536930,
-            'like_count': int,
-            'description': '',
-            'title': 'Ingrid Michaelson, Zooey Deschanel - Merry Christmas Happy New Year',
-            'categories': list,
-            'age_limit': 0,
-            'duration': 2.9583333333333335,
-            'uploader': 'Reaction GIFs',
-            'view_count': int,
-        }
-    }, {
-        'url': 'https://gfycat.com/ru/RemarkableDrearyAmurstarfish',
-        'only_matching': True
-    }, {
-        'url': 'https://gfycat.com/gifs/detail/UnconsciousLankyIvorygull',
-        'only_matching': True
-    }, {
-        'url': 'https://gfycat.com/acceptablehappygoluckyharborporpoise-baseball',
-        'only_matching': True
-    }, {
-        'url': 'https://thumbs.gfycat.com/acceptablehappygoluckyharborporpoise-size_restricted.gif',
-        'only_matching': True
-    }, {
-        'url': 'https://giant.gfycat.com/acceptablehappygoluckyharborporpoise.mp4',
-        'only_matching': True
-    }, {
-        'url': 'http://gfycat.com/IFR/JauntyTimelyAmazontreeboa',
-        'only_matching': True
-    }]
-
-    def _real_extract(self, url):
-        video_id = self._match_id(url)
-
-        gfy = self._download_json(
-            'https://api.gfycat.com/v1/gfycats/%s' % video_id,
-            video_id, 'Downloading video info')
-        if 'error' in gfy:
-            raise ExtractorError('Gfycat said: ' + gfy['error'], expected=True)
-        gfy = gfy['gfyItem']
-
-        title = gfy.get('title') or gfy['gfyName']
-        description = gfy.get('description')
-        timestamp = int_or_none(gfy.get('createDate'))
-        uploader = gfy.get('userName') or gfy.get('username')
-        view_count = int_or_none(gfy.get('views'))
-        like_count = int_or_none(gfy.get('likes'))
-        dislike_count = int_or_none(gfy.get('dislikes'))
-        age_limit = 18 if gfy.get('nsfw') == '1' else 0
-
-        width = int_or_none(gfy.get('width'))
-        height = int_or_none(gfy.get('height'))
-        fps = int_or_none(gfy.get('frameRate'))
-        num_frames = int_or_none(gfy.get('numFrames'))
-
-        duration = float_or_none(num_frames, fps) if num_frames and fps else None
-
-        categories = gfy.get('tags') or gfy.get('extraLemmas') or []
-
-        FORMATS = ('gif', 'webm', 'mp4')
-        quality = qualities(FORMATS)
-
-        formats = []
-        for format_id in FORMATS:
-            video_url = gfy.get('%sUrl' % format_id)
-            if not video_url:
-                continue
-            filesize = int_or_none(gfy.get('%sSize' % format_id))
-            formats.append({
-                'url': video_url,
-                'format_id': format_id,
-                'width': width,
-                'height': height,
-                'fps': fps,
-                'filesize': filesize,
-                'quality': quality(format_id),
-            })
-
-        return {
-            'id': video_id,
-            'title': title,
-            'description': description,
-            'timestamp': timestamp,
-            'uploader': gfy.get('userDisplayName') or uploader,
-            'uploader_id': uploader,
-            'duration': duration,
-            'view_count': view_count,
-            'like_count': like_count,
-            'dislike_count': dislike_count,
-            'categories': categories,
-            'age_limit': age_limit,
-            'formats': formats,
-        }
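
For reference, the qualities() helper used in the removed file ranks format IDs by their position in the preference tuple. A rough standalone sketch of that behavior, assuming it ranks purely by index (unknown IDs sort below everything else):

    def qualities(quality_ids):
        # Sketch of a qualities-style ranker: later entries in the tuple
        # are preferred; IDs not in the tuple rank lowest.
        def q(qid):
            try:
                return quality_ids.index(qid)
            except ValueError:
                return -1
        return q

    quality = qualities(('gif', 'webm', 'mp4'))
    assert quality('mp4') > quality('webm') > quality('gif')  # mp4 preferred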


@@ -31,7 +31,6 @@ class GrouponIE(InfoExtractor):
     }

     _PROVIDERS = {
-        'ooyala': ('ooyala:%s', 'Ooyala'),
         'youtube': ('%s', 'Youtube'),
     }
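
The _PROVIDERS table maps a provider name to a URL template and an extractor key; with Ooyala removed, only YouTube remains. A hypothetical illustration of how such a table could be consulted (the surrounding Groupon code is not part of this diff, and resolve() is an invented name):

    PROVIDERS = {'youtube': ('%s', 'Youtube')}

    def resolve(provider, external_id):
        # Look up the provider; unknown providers (e.g. the removed
        # 'ooyala') yield None so the caller can skip them.
        entry = PROVIDERS.get(provider.lower())
        if not entry:
            return None
        url_template, ie_key = entry
        return {'_type': 'url', 'url': url_template % external_id, 'ie_key': ie_key}

    assert resolve('youtube', 'abc123') == {'_type': 'url', 'url': 'abc123', 'ie_key': 'Youtube'}
    assert resolve('ooyala', 'xyz') is None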

Some files were not shown because too many files have changed in this diff.