mirror of https://github.com/yt-dlp/yt-dlp.git
synced 2025-07-10 07:18:33 +00:00

commit 65e217714d

Merge remote-tracking branch 'upstream/master' into wait-retries

.github/ISSUE_TEMPLATE/1_broken_site.yml | 24 (vendored)

@@ -2,13 +2,11 @@ name: Broken site support
 description: Report issue with yt-dlp on a supported site
 labels: [triage, site-bug]
 body:
-  - type: checkboxes
+  - type: markdown
     attributes:
-      label: DO NOT REMOVE OR SKIP THE ISSUE TEMPLATE
-      description: Fill all fields even if you think it is irrelevant for the issue
-      options:
-        - label: I understand that I will be **blocked** if I *intentionally* remove or skip any mandatory\* field
-          required: true
+      value: |
+        > [!IMPORTANT]
+        > Not providing the required (*) information or removing the template will result in your issue being closed and ignored.
   - type: checkboxes
     id: checklist
     attributes:
@@ -24,9 +22,7 @@ body:
           required: true
         - label: I've checked that all URLs and arguments with special characters are [properly quoted or escaped](https://github.com/yt-dlp/yt-dlp/wiki/FAQ#video-url-contains-an-ampersand--and-im-getting-some-strange-output-1-2839-or-v-is-not-recognized-as-an-internal-or-external-command)
           required: true
-        - label: I've searched [known issues](https://github.com/yt-dlp/yt-dlp/issues/3766) and the [bugtracker](https://github.com/yt-dlp/yt-dlp/issues?q=) for similar issues **including closed ones**. DO NOT post duplicates
-          required: true
-        - label: I've read the [guidelines for opening an issue](https://github.com/yt-dlp/yt-dlp/blob/master/CONTRIBUTING.md#opening-an-issue)
+        - label: I've searched [known issues](https://github.com/yt-dlp/yt-dlp/issues/3766), [the FAQ](https://github.com/yt-dlp/yt-dlp/wiki/FAQ), and the [bugtracker](https://github.com/yt-dlp/yt-dlp/issues?q=is%3Aissue%20-label%3Aspam%20%20) for similar issues **including closed ones**. DO NOT post duplicates
           required: true
         - label: I've read about [sharing account credentials](https://github.com/yt-dlp/yt-dlp/blob/master/CONTRIBUTING.md#are-you-willing-to-share-account-details-if-needed) and I'm willing to share it if required
   - type: input
@@ -47,6 +43,8 @@ body:
     id: verbose
     attributes:
       label: Provide verbose output that clearly demonstrates the problem
+      description: |
+        This is mandatory unless absolutely impossible to provide. If you are unable to provide the output, please explain why.
       options:
         - label: Run **your** yt-dlp command with **-vU** flag added (`yt-dlp -vU <your command line>`)
           required: true
@@ -78,11 +76,3 @@ body:
         render: shell
     validations:
       required: true
-  - type: markdown
-    attributes:
-      value: |
-        > [!CAUTION]
-        > ### GitHub is experiencing a high volume of malicious spam comments.
-        > ### If you receive any replies asking you download a file, do NOT follow the download links!
-        >
-        > Note that this issue may be temporarily locked as an anti-spam measure after it is opened.

.github/ISSUE_TEMPLATE/2_site_support_request.yml (vendored)

@@ -2,13 +2,11 @@ name: Site support request
 description: Request support for a new site
 labels: [triage, site-request]
 body:
-  - type: checkboxes
+  - type: markdown
     attributes:
-      label: DO NOT REMOVE OR SKIP THE ISSUE TEMPLATE
-      description: Fill all fields even if you think it is irrelevant for the issue
-      options:
-        - label: I understand that I will be **blocked** if I *intentionally* remove or skip any mandatory\* field
-          required: true
+      value: |
+        > [!IMPORTANT]
+        > Not providing the required (*) information or removing the template will result in your issue being closed and ignored.
   - type: checkboxes
     id: checklist
     attributes:
@@ -24,9 +22,7 @@ body:
           required: true
         - label: I've checked that none of provided URLs [violate any copyrights](https://github.com/yt-dlp/yt-dlp/blob/master/CONTRIBUTING.md#is-the-website-primarily-used-for-piracy) or contain any [DRM](https://en.wikipedia.org/wiki/Digital_rights_management) to the best of my knowledge
           required: true
-        - label: I've searched [known issues](https://github.com/yt-dlp/yt-dlp/issues/3766) and the [bugtracker](https://github.com/yt-dlp/yt-dlp/issues?q=) for similar issues **including closed ones**. DO NOT post duplicates
-          required: true
-        - label: I've read the [guidelines for opening an issue](https://github.com/yt-dlp/yt-dlp/blob/master/CONTRIBUTING.md#opening-an-issue)
+        - label: I've searched the [bugtracker](https://github.com/yt-dlp/yt-dlp/issues?q=is%3Aissue%20-label%3Aspam%20%20) for similar requests **including closed ones**. DO NOT post duplicates
           required: true
         - label: I've read about [sharing account credentials](https://github.com/yt-dlp/yt-dlp/blob/master/CONTRIBUTING.md#are-you-willing-to-share-account-details-if-needed) and am willing to share it if required
   - type: input
@@ -59,6 +55,8 @@ body:
     id: verbose
     attributes:
       label: Provide verbose output that clearly demonstrates the problem
+      description: |
+        This is mandatory unless absolutely impossible to provide. If you are unable to provide the output, please explain why.
       options:
         - label: Run **your** yt-dlp command with **-vU** flag added (`yt-dlp -vU <your command line>`)
           required: true
@@ -90,11 +88,3 @@ body:
         render: shell
     validations:
       required: true
-  - type: markdown
-    attributes:
-      value: |
-        > [!CAUTION]
-        > ### GitHub is experiencing a high volume of malicious spam comments.
-        > ### If you receive any replies asking you download a file, do NOT follow the download links!
-        >
-        > Note that this issue may be temporarily locked as an anti-spam measure after it is opened.

.github/ISSUE_TEMPLATE/3_site_feature_request.yml (vendored)

@@ -1,14 +1,12 @@
 name: Site feature request
-description: Request a new functionality for a supported site
+description: Request new functionality for a site supported by yt-dlp
 labels: [triage, site-enhancement]
 body:
-  - type: checkboxes
+  - type: markdown
     attributes:
-      label: DO NOT REMOVE OR SKIP THE ISSUE TEMPLATE
-      description: Fill all fields even if you think it is irrelevant for the issue
-      options:
-        - label: I understand that I will be **blocked** if I *intentionally* remove or skip any mandatory\* field
-          required: true
+      value: |
+        > [!IMPORTANT]
+        > Not providing the required (*) information or removing the template will result in your issue being closed and ignored.
   - type: checkboxes
     id: checklist
     attributes:
@@ -22,9 +20,7 @@ body:
           required: true
         - label: I've checked that all provided URLs are playable in a browser with the same IP and same login details
           required: true
-        - label: I've searched [known issues](https://github.com/yt-dlp/yt-dlp/issues/3766) and the [bugtracker](https://github.com/yt-dlp/yt-dlp/issues?q=) for similar issues **including closed ones**. DO NOT post duplicates
-          required: true
-        - label: I've read the [guidelines for opening an issue](https://github.com/yt-dlp/yt-dlp/blob/master/CONTRIBUTING.md#opening-an-issue)
+        - label: I've searched the [bugtracker](https://github.com/yt-dlp/yt-dlp/issues?q=is%3Aissue%20-label%3Aspam%20%20) for similar requests **including closed ones**. DO NOT post duplicates
           required: true
         - label: I've read about [sharing account credentials](https://github.com/yt-dlp/yt-dlp/blob/master/CONTRIBUTING.md#are-you-willing-to-share-account-details-if-needed) and I'm willing to share it if required
   - type: input
@@ -55,6 +51,8 @@ body:
     id: verbose
     attributes:
       label: Provide verbose output that clearly demonstrates the problem
+      description: |
+        This is mandatory unless absolutely impossible to provide. If you are unable to provide the output, please explain why.
       options:
         - label: Run **your** yt-dlp command with **-vU** flag added (`yt-dlp -vU <your command line>`)
           required: true
@@ -86,11 +84,3 @@ body:
         render: shell
     validations:
       required: true
-  - type: markdown
-    attributes:
-      value: |
-        > [!CAUTION]
-        > ### GitHub is experiencing a high volume of malicious spam comments.
-        > ### If you receive any replies asking you download a file, do NOT follow the download links!
-        >
-        > Note that this issue may be temporarily locked as an anti-spam measure after it is opened.

.github/ISSUE_TEMPLATE/4_bug_report.yml | 28 (vendored)

@@ -2,13 +2,11 @@ name: Core bug report
 description: Report a bug unrelated to any particular site or extractor
 labels: [triage, bug]
 body:
-  - type: checkboxes
+  - type: markdown
     attributes:
-      label: DO NOT REMOVE OR SKIP THE ISSUE TEMPLATE
-      description: Fill all fields even if you think it is irrelevant for the issue
-      options:
-        - label: I understand that I will be **blocked** if I *intentionally* remove or skip any mandatory\* field
-          required: true
+      value: |
+        > [!IMPORTANT]
+        > Not providing the required (*) information or removing the template will result in your issue being closed and ignored.
   - type: checkboxes
     id: checklist
     attributes:
@@ -20,13 +18,7 @@ body:
           required: true
         - label: I've verified that I have **updated yt-dlp to nightly or master** ([update instructions](https://github.com/yt-dlp/yt-dlp#update-channels))
           required: true
-        - label: I've checked that all provided URLs are playable in a browser with the same IP and same login details
-          required: true
-        - label: I've checked that all URLs and arguments with special characters are [properly quoted or escaped](https://github.com/yt-dlp/yt-dlp/wiki/FAQ#video-url-contains-an-ampersand--and-im-getting-some-strange-output-1-2839-or-v-is-not-recognized-as-an-internal-or-external-command)
-          required: true
-        - label: I've searched [known issues](https://github.com/yt-dlp/yt-dlp/issues/3766) and the [bugtracker](https://github.com/yt-dlp/yt-dlp/issues?q=) for similar issues **including closed ones**. DO NOT post duplicates
-          required: true
-        - label: I've read the [guidelines for opening an issue](https://github.com/yt-dlp/yt-dlp/blob/master/CONTRIBUTING.md#opening-an-issue)
+        - label: I've searched [known issues](https://github.com/yt-dlp/yt-dlp/issues/3766), [the FAQ](https://github.com/yt-dlp/yt-dlp/wiki/FAQ), and the [bugtracker](https://github.com/yt-dlp/yt-dlp/issues?q=is%3Aissue%20-label%3Aspam%20%20) for similar issues **including closed ones**. DO NOT post duplicates
           required: true
   - type: textarea
     id: description
@@ -40,6 +32,8 @@ body:
     id: verbose
     attributes:
       label: Provide verbose output that clearly demonstrates the problem
+      description: |
+        This is mandatory unless absolutely impossible to provide. If you are unable to provide the output, please explain why.
       options:
         - label: Run **your** yt-dlp command with **-vU** flag added (`yt-dlp -vU <your command line>`)
           required: true
@@ -71,11 +65,3 @@ body:
         render: shell
     validations:
       required: true
-  - type: markdown
-    attributes:
-      value: |
-        > [!CAUTION]
-        > ### GitHub is experiencing a high volume of malicious spam comments.
-        > ### If you receive any replies asking you download a file, do NOT follow the download links!
-        >
-        > Note that this issue may be temporarily locked as an anti-spam measure after it is opened.

.github/ISSUE_TEMPLATE/5_feature_request.yml | 26 (vendored)

@@ -1,14 +1,12 @@
 name: Feature request
-description: Request a new functionality unrelated to any particular site or extractor
+description: Request a new feature unrelated to any particular site or extractor
 labels: [triage, enhancement]
 body:
-  - type: checkboxes
+  - type: markdown
     attributes:
-      label: DO NOT REMOVE OR SKIP THE ISSUE TEMPLATE
-      description: Fill all fields even if you think it is irrelevant for the issue
-      options:
-        - label: I understand that I will be **blocked** if I *intentionally* remove or skip any mandatory\* field
-          required: true
+      value: |
+        > [!IMPORTANT]
+        > Not providing the required (*) information or removing the template will result in your issue being closed and ignored.
   - type: checkboxes
     id: checklist
     attributes:
@@ -22,9 +20,7 @@ body:
           required: true
         - label: I've verified that I have **updated yt-dlp to nightly or master** ([update instructions](https://github.com/yt-dlp/yt-dlp#update-channels))
           required: true
-        - label: I've searched [known issues](https://github.com/yt-dlp/yt-dlp/issues/3766) and the [bugtracker](https://github.com/yt-dlp/yt-dlp/issues?q=) for similar issues **including closed ones**. DO NOT post duplicates
-          required: true
-        - label: I've read the [guidelines for opening an issue](https://github.com/yt-dlp/yt-dlp/blob/master/CONTRIBUTING.md#opening-an-issue)
+        - label: I've searched the [bugtracker](https://github.com/yt-dlp/yt-dlp/issues?q=is%3Aissue%20-label%3Aspam%20%20) for similar requests **including closed ones**. DO NOT post duplicates
           required: true
   - type: textarea
     id: description
@@ -38,6 +34,8 @@ body:
     id: verbose
     attributes:
       label: Provide verbose output that clearly demonstrates the problem
+      description: |
+        This is mandatory unless absolutely impossible to provide. If you are unable to provide the output, please explain why.
       options:
         - label: Run **your** yt-dlp command with **-vU** flag added (`yt-dlp -vU <your command line>`)
         - label: "If using API, add `'verbose': True` to `YoutubeDL` params instead"
@@ -65,11 +63,3 @@ body:
         [youtube] Extracting URL: https://www.youtube.com/watch?v=BaW_jenozKc
         <more lines>
       render: shell
-  - type: markdown
-    attributes:
-      value: |
-        > [!CAUTION]
-        > ### GitHub is experiencing a high volume of malicious spam comments.
-        > ### If you receive any replies asking you download a file, do NOT follow the download links!
-        >
-        > Note that this issue may be temporarily locked as an anti-spam measure after it is opened.

.github/ISSUE_TEMPLATE/6_question.yml | 26 (vendored)

@@ -1,14 +1,12 @@
 name: Ask question
-description: Ask yt-dlp related question
+description: Ask a question about using yt-dlp
 labels: [question]
 body:
-  - type: checkboxes
+  - type: markdown
     attributes:
-      label: DO NOT REMOVE OR SKIP THE ISSUE TEMPLATE
-      description: Fill all fields even if you think it is irrelevant for the issue
-      options:
-        - label: I understand that I will be **blocked** if I *intentionally* remove or skip any mandatory\* field
-          required: true
+      value: |
+        > [!IMPORTANT]
+        > Not providing the required (*) information or removing the template will result in your issue being closed and ignored.
   - type: markdown
     attributes:
       value: |
@@ -28,9 +26,7 @@ body:
           required: true
         - label: I've verified that I have **updated yt-dlp to nightly or master** ([update instructions](https://github.com/yt-dlp/yt-dlp#update-channels))
           required: true
-        - label: I've searched [known issues](https://github.com/yt-dlp/yt-dlp/issues/3766) and the [bugtracker](https://github.com/yt-dlp/yt-dlp/issues?q=) for similar questions **including closed ones**. DO NOT post duplicates
-          required: true
-        - label: I've read the [guidelines for opening an issue](https://github.com/yt-dlp/yt-dlp/blob/master/CONTRIBUTING.md#opening-an-issue)
+        - label: I've searched [known issues](https://github.com/yt-dlp/yt-dlp/issues/3766), [the FAQ](https://github.com/yt-dlp/yt-dlp/wiki/FAQ), and the [bugtracker](https://github.com/yt-dlp/yt-dlp/issues?q=is%3Aissue%20-label%3Aspam%20%20) for similar questions **including closed ones**. DO NOT post duplicates
           required: true
   - type: textarea
     id: question
@@ -44,6 +40,8 @@ body:
     id: verbose
     attributes:
       label: Provide verbose output that clearly demonstrates the problem
+      description: |
+        This is mandatory unless absolutely impossible to provide. If you are unable to provide the output, please explain why.
      options:
         - label: Run **your** yt-dlp command with **-vU** flag added (`yt-dlp -vU <your command line>`)
         - label: "If using API, add `'verbose': True` to `YoutubeDL` params instead"
@@ -71,11 +69,3 @@ body:
         [youtube] Extracting URL: https://www.youtube.com/watch?v=BaW_jenozKc
         <more lines>
       render: shell
-  - type: markdown
-    attributes:
-      value: |
-        > [!CAUTION]
-        > ### GitHub is experiencing a high volume of malicious spam comments.
-        > ### If you receive any replies asking you download a file, do NOT follow the download links!
-        >
-        > Note that this issue may be temporarily locked as an anti-spam measure after it is opened.

.github/ISSUE_TEMPLATE/config.yml | 7 (vendored)

@@ -1,8 +1,5 @@
 blank_issues_enabled: false
 contact_links:
-  - name: Get help from the community on Discord
+  - name: Get help on Discord
     url: https://discord.gg/H5MNcFW63r
-    about: Join the yt-dlp Discord for community-powered support!
-  - name: Matrix Bridge to the Discord server
-    url: https://matrix.to/#/#yt-dlp:matrix.org
-    about: For those who do not want to use Discord
+    about: Join the yt-dlp Discord server for support and discussion

.github/ISSUE_TEMPLATE_tmpl/1_broken_site.yml (vendored)

@@ -18,9 +18,7 @@ body:
           required: true
         - label: I've checked that all URLs and arguments with special characters are [properly quoted or escaped](https://github.com/yt-dlp/yt-dlp/wiki/FAQ#video-url-contains-an-ampersand--and-im-getting-some-strange-output-1-2839-or-v-is-not-recognized-as-an-internal-or-external-command)
           required: true
-        - label: I've searched [known issues](https://github.com/yt-dlp/yt-dlp/issues/3766) and the [bugtracker](https://github.com/yt-dlp/yt-dlp/issues?q=) for similar issues **including closed ones**. DO NOT post duplicates
-          required: true
-        - label: I've read the [guidelines for opening an issue](https://github.com/yt-dlp/yt-dlp/blob/master/CONTRIBUTING.md#opening-an-issue)
+        - label: I've searched [known issues](https://github.com/yt-dlp/yt-dlp/issues/3766), [the FAQ](https://github.com/yt-dlp/yt-dlp/wiki/FAQ), and the [bugtracker](https://github.com/yt-dlp/yt-dlp/issues?q=is%%3Aissue%%20-label%%3Aspam%%20%%20) for similar issues **including closed ones**. DO NOT post duplicates
           required: true
         - label: I've read about [sharing account credentials](https://github.com/yt-dlp/yt-dlp/blob/master/CONTRIBUTING.md#are-you-willing-to-share-account-details-if-needed) and I'm willing to share it if required
   - type: input

.github/ISSUE_TEMPLATE_tmpl/2_site_support_request.yml (vendored)

@@ -18,9 +18,7 @@ body:
           required: true
         - label: I've checked that none of provided URLs [violate any copyrights](https://github.com/yt-dlp/yt-dlp/blob/master/CONTRIBUTING.md#is-the-website-primarily-used-for-piracy) or contain any [DRM](https://en.wikipedia.org/wiki/Digital_rights_management) to the best of my knowledge
           required: true
-        - label: I've searched [known issues](https://github.com/yt-dlp/yt-dlp/issues/3766) and the [bugtracker](https://github.com/yt-dlp/yt-dlp/issues?q=) for similar issues **including closed ones**. DO NOT post duplicates
-          required: true
-        - label: I've read the [guidelines for opening an issue](https://github.com/yt-dlp/yt-dlp/blob/master/CONTRIBUTING.md#opening-an-issue)
+        - label: I've searched the [bugtracker](https://github.com/yt-dlp/yt-dlp/issues?q=is%%3Aissue%%20-label%%3Aspam%%20%%20) for similar requests **including closed ones**. DO NOT post duplicates
           required: true
         - label: I've read about [sharing account credentials](https://github.com/yt-dlp/yt-dlp/blob/master/CONTRIBUTING.md#are-you-willing-to-share-account-details-if-needed) and am willing to share it if required
   - type: input

.github/ISSUE_TEMPLATE_tmpl/3_site_feature_request.yml (vendored)

@@ -1,5 +1,5 @@
 name: Site feature request
-description: Request a new functionality for a supported site
+description: Request new functionality for a site supported by yt-dlp
 labels: [triage, site-enhancement]
 body:
 %(no_skip)s
@@ -16,9 +16,7 @@ body:
           required: true
         - label: I've checked that all provided URLs are playable in a browser with the same IP and same login details
           required: true
-        - label: I've searched [known issues](https://github.com/yt-dlp/yt-dlp/issues/3766) and the [bugtracker](https://github.com/yt-dlp/yt-dlp/issues?q=) for similar issues **including closed ones**. DO NOT post duplicates
-          required: true
-        - label: I've read the [guidelines for opening an issue](https://github.com/yt-dlp/yt-dlp/blob/master/CONTRIBUTING.md#opening-an-issue)
+        - label: I've searched the [bugtracker](https://github.com/yt-dlp/yt-dlp/issues?q=is%%3Aissue%%20-label%%3Aspam%%20%%20) for similar requests **including closed ones**. DO NOT post duplicates
           required: true
         - label: I've read about [sharing account credentials](https://github.com/yt-dlp/yt-dlp/blob/master/CONTRIBUTING.md#are-you-willing-to-share-account-details-if-needed) and I'm willing to share it if required
   - type: input

.github/ISSUE_TEMPLATE_tmpl/4_bug_report.yml | 8 (vendored)

@@ -14,13 +14,7 @@ body:
           required: true
         - label: I've verified that I have **updated yt-dlp to nightly or master** ([update instructions](https://github.com/yt-dlp/yt-dlp#update-channels))
           required: true
-        - label: I've checked that all provided URLs are playable in a browser with the same IP and same login details
-          required: true
-        - label: I've checked that all URLs and arguments with special characters are [properly quoted or escaped](https://github.com/yt-dlp/yt-dlp/wiki/FAQ#video-url-contains-an-ampersand--and-im-getting-some-strange-output-1-2839-or-v-is-not-recognized-as-an-internal-or-external-command)
-          required: true
-        - label: I've searched [known issues](https://github.com/yt-dlp/yt-dlp/issues/3766) and the [bugtracker](https://github.com/yt-dlp/yt-dlp/issues?q=) for similar issues **including closed ones**. DO NOT post duplicates
-          required: true
-        - label: I've read the [guidelines for opening an issue](https://github.com/yt-dlp/yt-dlp/blob/master/CONTRIBUTING.md#opening-an-issue)
+        - label: I've searched [known issues](https://github.com/yt-dlp/yt-dlp/issues/3766), [the FAQ](https://github.com/yt-dlp/yt-dlp/wiki/FAQ), and the [bugtracker](https://github.com/yt-dlp/yt-dlp/issues?q=is%%3Aissue%%20-label%%3Aspam%%20%%20) for similar issues **including closed ones**. DO NOT post duplicates
           required: true
   - type: textarea
     id: description

.github/ISSUE_TEMPLATE_tmpl/5_feature_request.yml (vendored)

@@ -1,5 +1,5 @@
 name: Feature request
-description: Request a new functionality unrelated to any particular site or extractor
+description: Request a new feature unrelated to any particular site or extractor
 labels: [triage, enhancement]
 body:
 %(no_skip)s
@@ -16,9 +16,7 @@ body:
           required: true
         - label: I've verified that I have **updated yt-dlp to nightly or master** ([update instructions](https://github.com/yt-dlp/yt-dlp#update-channels))
           required: true
-        - label: I've searched [known issues](https://github.com/yt-dlp/yt-dlp/issues/3766) and the [bugtracker](https://github.com/yt-dlp/yt-dlp/issues?q=) for similar issues **including closed ones**. DO NOT post duplicates
-          required: true
-        - label: I've read the [guidelines for opening an issue](https://github.com/yt-dlp/yt-dlp/blob/master/CONTRIBUTING.md#opening-an-issue)
+        - label: I've searched the [bugtracker](https://github.com/yt-dlp/yt-dlp/issues?q=is%%3Aissue%%20-label%%3Aspam%%20%%20) for similar requests **including closed ones**. DO NOT post duplicates
           required: true
   - type: textarea
     id: description

.github/ISSUE_TEMPLATE_tmpl/6_question.yml | 6 (vendored)

@@ -1,5 +1,5 @@
 name: Ask question
-description: Ask yt-dlp related question
+description: Ask a question about using yt-dlp
 labels: [question]
 body:
 %(no_skip)s
@@ -22,9 +22,7 @@ body:
           required: true
         - label: I've verified that I have **updated yt-dlp to nightly or master** ([update instructions](https://github.com/yt-dlp/yt-dlp#update-channels))
           required: true
-        - label: I've searched [known issues](https://github.com/yt-dlp/yt-dlp/issues/3766) and the [bugtracker](https://github.com/yt-dlp/yt-dlp/issues?q=) for similar questions **including closed ones**. DO NOT post duplicates
-          required: true
-        - label: I've read the [guidelines for opening an issue](https://github.com/yt-dlp/yt-dlp/blob/master/CONTRIBUTING.md#opening-an-issue)
+        - label: I've searched [known issues](https://github.com/yt-dlp/yt-dlp/issues/3766), [the FAQ](https://github.com/yt-dlp/yt-dlp/wiki/FAQ), and the [bugtracker](https://github.com/yt-dlp/yt-dlp/issues?q=is%%3Aissue%%20-label%%3Aspam%%20%%20) for similar questions **including closed ones**. DO NOT post duplicates
           required: true
   - type: textarea
     id: question

.github/PULL_REQUEST_TEMPLATE.md | 37 (vendored)

@@ -1,14 +1,17 @@
-**IMPORTANT**: PRs without the template will be CLOSED
+<!--
+**IMPORTANT**: PRs without the template will be CLOSED
+
+Due to the high volume of pull requests, it may be a while before your PR is reviewed.
+Please try to keep your pull request focused on a single bugfix or new feature.
+Pull requests with a vast scope and/or very large diff will take much longer to review.
+It is recommended for new contributors to stick to smaller pull requests, so you can receive much more immediate feedback as you familiarize yourself with the codebase.
+
+PLEASE AVOID FORCE-PUSHING after opening a PR, as it makes reviewing more difficult.
+-->
 
 ### Description of your *pull request* and other information
 
-<!--
-Explanation of your *pull request* in arbitrary form goes here. Please **make sure the description explains the purpose and effect** of your *pull request* and is worded well enough to be understood. Provide as much **context and examples** as possible
-
--->
-
-ADD DESCRIPTION HERE
+ADD DETAILED DESCRIPTION HERE
 
 Fixes #
 
@@ -16,24 +19,22 @@ ### Description of your *pull request* and other information
 <details open><summary>Template</summary> <!-- OPEN is intentional -->
 
 <!--
-
 # PLEASE FOLLOW THE GUIDE BELOW
 
 - You will be asked some questions, please read them **carefully** and answer honestly
 - Put an `x` into all the boxes `[ ]` relevant to your *pull request* (like [x])
-- Use *Preview* tab to see how your *pull request* will actually look like
-
+- Use *Preview* tab to see what your *pull request* will actually look like
 -->
 
 ### Before submitting a *pull request* make sure you have:
 - [ ] At least skimmed through [contributing guidelines](https://github.com/yt-dlp/yt-dlp/blob/master/CONTRIBUTING.md#developer-instructions) including [yt-dlp coding conventions](https://github.com/yt-dlp/yt-dlp/blob/master/CONTRIBUTING.md#yt-dlp-coding-conventions)
 - [ ] [Searched](https://github.com/yt-dlp/yt-dlp/search?q=is%3Apr&type=Issues) the bugtracker for similar pull requests
 
-### In order to be accepted and merged into yt-dlp each piece of code must be in public domain or released under [Unlicense](http://unlicense.org/). Check all of the following options that apply:
-- [ ] I am the original author of this code and I am willing to release it under [Unlicense](http://unlicense.org/)
-- [ ] I am not the original author of this code but it is in public domain or released under [Unlicense](http://unlicense.org/) (provide reliable evidence)
+### In order to be accepted and merged into yt-dlp each piece of code must be in public domain or released under [Unlicense](http://unlicense.org/). Check those that apply and remove the others:
+- [ ] I am the original author of the code in this PR, and I am willing to release it under [Unlicense](http://unlicense.org/)
+- [ ] I am not the original author of the code in this PR, but it is in the public domain or released under [Unlicense](http://unlicense.org/) (provide reliable evidence)
 
-### What is the purpose of your *pull request*?
+### What is the purpose of your *pull request*? Check those that apply and remove the others:
 - [ ] Fix or improvement to an extractor (Make sure to add/update tests)
 - [ ] New extractor ([Piracy websites will not be accepted](https://github.com/yt-dlp/yt-dlp/blob/master/CONTRIBUTING.md#is-the-website-primarily-used-for-piracy))
 - [ ] Core bug fix/improvement

.github/workflows/codeql.yml | 6 (vendored)

@@ -33,7 +33,7 @@ jobs:
 
     # Initializes the CodeQL tools for scanning.
     - name: Initialize CodeQL
-      uses: github/codeql-action/init@v2
+      uses: github/codeql-action/init@v3
      with:
        languages: ${{ matrix.language }}
        # If you wish to specify custom queries, you can do so here or in a config file.
@@ -47,7 +47,7 @@ jobs:
    # Autobuild attempts to build any compiled languages (C/C++, C#, Go, Java, or Swift).
    # If this step fails, then you should remove it and run the build manually (see below)
    - name: Autobuild
-      uses: github/codeql-action/autobuild@v2
+      uses: github/codeql-action/autobuild@v3
 
    # ℹ️ Command-line programs to run using the OS shell.
    # 📚 See https://docs.github.com/en/actions/using-workflows/workflow-syntax-for-github-actions#jobsjob_idstepsrun
@@ -60,6 +60,6 @@ jobs:
    #   ./location_of_script_within_repo/buildscript.sh
 
    - name: Perform CodeQL Analysis
-      uses: github/codeql-action/analyze@v2
+      uses: github/codeql-action/analyze@v3
      with:
        category: "/language:${{matrix.language}}"

CONTRIBUTORS

@@ -736,3 +736,9 @@ NecroRomnt
 pjrobertson
 subsense
 test20140
+arantius
+entourage8
+lfavole
+mp3butcher
+slipinthedove
+YoshiTabletopGamer

Changelog.md | 43

@@ -4,6 +4,49 @@ # Changelog
 # To create a release, dispatch the https://github.com/yt-dlp/yt-dlp/actions/workflows/release.yml workflow on master
 -->
 
+### 2025.02.19
+
+#### Core changes
+- **jsinterp**
+    - [Add `js_number_to_string`](https://github.com/yt-dlp/yt-dlp/commit/0d9f061d38c3a4da61972e2adad317079f2f1c84) ([#12110](https://github.com/yt-dlp/yt-dlp/issues/12110)) by [Grub4K](https://github.com/Grub4K)
+    - [Improve zeroise](https://github.com/yt-dlp/yt-dlp/commit/4ca8c44a073d5aa3a3e3112c35b2b23d6ce25ac6) ([#12313](https://github.com/yt-dlp/yt-dlp/issues/12313)) by [seproDev](https://github.com/seproDev)
+
+#### Extractor changes
+- **acast**: [Support shows.acast.com URLs](https://github.com/yt-dlp/yt-dlp/commit/57c717fee4bfbc9309845bbb48901b72e4b69304) ([#12223](https://github.com/yt-dlp/yt-dlp/issues/12223)) by [barsnick](https://github.com/barsnick)
+- **cwtv**
+    - [Fix extractor](https://github.com/yt-dlp/yt-dlp/commit/18a28514e306e822eab4f3a79c76d515bf076406) ([#12207](https://github.com/yt-dlp/yt-dlp/issues/12207)) by [arantius](https://github.com/arantius)
+    - movie: [Add extractor](https://github.com/yt-dlp/yt-dlp/commit/03c3d705778c07739e0034b51490877cffdc0983) ([#12227](https://github.com/yt-dlp/yt-dlp/issues/12227)) by [bashonly](https://github.com/bashonly)
+- **digiview**: [Add extractor](https://github.com/yt-dlp/yt-dlp/commit/f53553087d3fde9dcd61d6e9f98caf09db1d8ef2) ([#9902](https://github.com/yt-dlp/yt-dlp/issues/9902)) by [lfavole](https://github.com/lfavole)
+- **dropbox**: [Fix extraction](https://github.com/yt-dlp/yt-dlp/commit/861aeec449c8f3c062d962945b234ff0341f61f3) ([#12228](https://github.com/yt-dlp/yt-dlp/issues/12228)) by [bashonly](https://github.com/bashonly)
+- **francetv**
+    - site
+        - [Fix extractor](https://github.com/yt-dlp/yt-dlp/commit/817483ccc68aed6049ed9c4a2ffae44ca82d2b1c) ([#12236](https://github.com/yt-dlp/yt-dlp/issues/12236)) by [bashonly](https://github.com/bashonly)
+        - [Fix livestream extraction](https://github.com/yt-dlp/yt-dlp/commit/1295bbedd45fa8d9bc3f7a194864ae280297848e) ([#12316](https://github.com/yt-dlp/yt-dlp/issues/12316)) by [bashonly](https://github.com/bashonly)
+- **francetvinfo.fr**: [Fix extractor](https://github.com/yt-dlp/yt-dlp/commit/5c4c2ddfaa47988b4d50c1ad4988badc0b4f30c2) ([#12402](https://github.com/yt-dlp/yt-dlp/issues/12402)) by [bashonly](https://github.com/bashonly)
+- **gem.cbc.ca**: [Fix extractors](https://github.com/yt-dlp/yt-dlp/commit/5271ef48c6f61c145e03e18e960995d2e651d205) ([#12404](https://github.com/yt-dlp/yt-dlp/issues/12404)) by [bashonly](https://github.com/bashonly), [dirkf](https://github.com/dirkf)
+- **generic**: [Extract `live_status` for DASH manifest URLs](https://github.com/yt-dlp/yt-dlp/commit/19edaa44fcd375f54e63d6227b092f5252d3e889) ([#12256](https://github.com/yt-dlp/yt-dlp/issues/12256)) by [mp3butcher](https://github.com/mp3butcher)
+- **globo**: [Fix extractor](https://github.com/yt-dlp/yt-dlp/commit/f8d0161455f00add65585ca1a476a7b5d56f5f96) ([#11795](https://github.com/yt-dlp/yt-dlp/issues/11795)) by [slipinthedove](https://github.com/slipinthedove), [YoshiTabletopGamer](https://github.com/YoshiTabletopGamer)
+- **goplay**: [Fix extractor](https://github.com/yt-dlp/yt-dlp/commit/d59f14a0a7a8b55e6bf468237def62b73ab4a517) ([#12237](https://github.com/yt-dlp/yt-dlp/issues/12237)) by [alard](https://github.com/alard)
+- **pbs**: [Support www.thirteen.org URLs](https://github.com/yt-dlp/yt-dlp/commit/9fb8ab2ff67fb699f60cce09163a580976e90c0e) ([#11191](https://github.com/yt-dlp/yt-dlp/issues/11191)) by [rohieb](https://github.com/rohieb)
+- **reddit**: [Bypass gated subreddit warning](https://github.com/yt-dlp/yt-dlp/commit/6ca23ffaa4663cb552f937f0b1e9769b66db11bd) ([#12335](https://github.com/yt-dlp/yt-dlp/issues/12335)) by [bashonly](https://github.com/bashonly)
+- **twitter**: [Fix syndication token generation](https://github.com/yt-dlp/yt-dlp/commit/14cd7f3443c6da4d49edaefcc12da9dee86e243e) ([#12107](https://github.com/yt-dlp/yt-dlp/issues/12107)) by [Grub4K](https://github.com/Grub4K), [pjrobertson](https://github.com/pjrobertson)
+- **youtube**
+    - [Retry on more critical requests](https://github.com/yt-dlp/yt-dlp/commit/d48e612609d012abbea3785be4d26d78a014abb2) ([#12339](https://github.com/yt-dlp/yt-dlp/issues/12339)) by [coletdjnz](https://github.com/coletdjnz)
+    - [nsig workaround for `tce` player JS](https://github.com/yt-dlp/yt-dlp/commit/ec17fb16e8d69d4e3e10fb73bf3221be8570dfee) ([#12401](https://github.com/yt-dlp/yt-dlp/issues/12401)) by [bashonly](https://github.com/bashonly)
+- **zdf**: [Extract more metadata](https://github.com/yt-dlp/yt-dlp/commit/241ace4f104d50fdf7638f9203927aefcf57a1f7) ([#9565](https://github.com/yt-dlp/yt-dlp/issues/9565)) by [StefanLobbenmeier](https://github.com/StefanLobbenmeier) (With fixes in [e7882b6](https://github.com/yt-dlp/yt-dlp/commit/e7882b682b959e476d8454911655b3e9b14c79b2) by [bashonly](https://github.com/bashonly))
+
+#### Downloader changes
+- **hls**
+    - [Fix `BYTERANGE` logic](https://github.com/yt-dlp/yt-dlp/commit/10b7ff68e98f17655e31952f6e17120b2d7dda96) ([#11972](https://github.com/yt-dlp/yt-dlp/issues/11972)) by [entourage8](https://github.com/entourage8)
+    - [Support `--write-pages` for m3u8 media playlists](https://github.com/yt-dlp/yt-dlp/commit/be69468752ff598cacee57bb80533deab2367a5d) ([#12333](https://github.com/yt-dlp/yt-dlp/issues/12333)) by [bashonly](https://github.com/bashonly)
+    - [Support `hls_media_playlist_data` format field](https://github.com/yt-dlp/yt-dlp/commit/c987be0acb6872c6561f28aa28171e803393d851) ([#12322](https://github.com/yt-dlp/yt-dlp/issues/12322)) by [bashonly](https://github.com/bashonly)
+
+#### Misc. changes
+- [Improve Issue/PR templates](https://github.com/yt-dlp/yt-dlp/commit/517ddf3c3f12560ab93e3d36244dc82db9f97818) ([#11499](https://github.com/yt-dlp/yt-dlp/issues/11499)) by [seproDev](https://github.com/seproDev) (With fixes in [4ecb833](https://github.com/yt-dlp/yt-dlp/commit/4ecb833472c90e078567b561fb7c089f1aa9587b) by [bashonly](https://github.com/bashonly))
+- **cleanup**: Miscellaneous: [4985a40](https://github.com/yt-dlp/yt-dlp/commit/4985a4041770eaa0016271809a1fd950dc809a55) by [dirkf](https://github.com/dirkf), [Grub4K](https://github.com/Grub4K), [StefanLobbenmeier](https://github.com/StefanLobbenmeier)
+- **docs**: [Add note to `supportedsites.md`](https://github.com/yt-dlp/yt-dlp/commit/01a63629a21781458dcbd38779898e117678f5ff) ([#12382](https://github.com/yt-dlp/yt-dlp/issues/12382)) by [seproDev](https://github.com/seproDev)
+- **test**: download: [Validate and sort info dict fields](https://github.com/yt-dlp/yt-dlp/commit/208163447408c78673b08c172beafe5c310fb167) ([#12299](https://github.com/yt-dlp/yt-dlp/issues/12299)) by [bashonly](https://github.com/bashonly), [pzhlkj6612](https://github.com/pzhlkj6612)
+
 ### 2025.01.26
 
 #### Core changes

README.md | 15

@@ -6,7 +6,6 @@
 [](#installation "Installation")
 [](https://pypi.org/project/yt-dlp "PyPI")
 [](Collaborators.md#collaborators "Donate")
-[](https://matrix.to/#/#yt-dlp:matrix.org "Matrix")
 [](https://discord.gg/H5MNcFW63r "Discord")
 [](supportedsites.md "Supported Sites")
 [](LICENSE "License")
@@ -338,10 +337,11 @@ ## General Options:
     --plugin-dirs PATH              Path to an additional directory to search
                                     for plugins. This option can be used
                                     multiple times to add multiple directories.
-                                    Note that this currently only works for
-                                    extractor plugins; postprocessor plugins can
-                                    only be loaded from the default plugin
-                                    directories
+                                    Use "default" to search the default plugin
+                                    directories (default)
+    --no-plugin-dirs                Clear plugin directories to search,
+                                    including defaults and those provided by
+                                    previous --plugin-dirs
     --flat-playlist                 Do not extract a playlist's URL result
                                     entries; some entry metadata may be missing
                                     and downloading may be bypassed
@@ -1530,7 +1530,7 @@ ## Sorting Formats
 - `hasvid`: Gives priority to formats that have a video stream
 - `hasaud`: Gives priority to formats that have an audio stream
 - `ie_pref`: The format preference
-- `lang`: The language preference
+- `lang`: The language preference as determined by the extractor (e.g. original language preferred over audio description)
 - `quality`: The quality of the format
 - `source`: The preference of the source
 - `proto`: Protocol used for download (`https`/`ftps` > `http`/`ftp` > `m3u8_native`/`m3u8` > `http_dash_segments`> `websocket_frag` > `mms`/`rtsp` > `f4f`/`f4m`)
@@ -1816,6 +1816,9 @@ #### hotstar
 * `vcodec`: vcodec to ignore - one or more of `h264`, `h265`, `dvh265`
 * `dr`: dynamic range to ignore - one or more of `sdr`, `hdr10`, `dv`
 
+#### instagram
+* `app_id`: The value of the `X-IG-App-ID` header used for API requests. Default is the web app ID, `936619743392459`
+
 #### niconicochannelplus
 * `max_comments`: Maximum number of comments to extract - default is `120`
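Note on the `lang` sorting field documented in the README hunk above: the same sort order is available when embedding yt-dlp. A minimal sketch, not part of this commit; it assumes yt-dlp is installed, reuses the example URL from the issue templates, and uses `format_sort`, the params-level counterpart of `-S`/`--format-sort`:

```python
import yt_dlp

opts = {
    # Prefer original-language audio first, then higher quality/resolution
    'format_sort': ['lang', 'quality', 'res'],
}
with yt_dlp.YoutubeDL(opts) as ydl:
    # download=False: only extract metadata and report the selected format
    info = ydl.extract_info('https://www.youtube.com/watch?v=BaW_jenozKc', download=False)
    print(info.get('format'))
```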

devscripts/make_issue_template.py

@@ -11,11 +11,13 @@
 
 from devscripts.utils import get_filename_args, read_file, write_file
 
-VERBOSE_TMPL = '''
+VERBOSE = '''
   - type: checkboxes
     id: verbose
     attributes:
       label: Provide verbose output that clearly demonstrates the problem
+      description: |
+        This is mandatory unless absolutely impossible to provide. If you are unable to provide the output, please explain why.
       options:
         - label: Run **your** yt-dlp command with **-vU** flag added (`yt-dlp -vU <your command line>`)
           required: true
@@ -47,31 +49,23 @@
         render: shell
     validations:
       required: true
-  - type: markdown
-    attributes:
-      value: |
-        > [!CAUTION]
-        > ### GitHub is experiencing a high volume of malicious spam comments.
-        > ### If you receive any replies asking you download a file, do NOT follow the download links!
-        >
-        > Note that this issue may be temporarily locked as an anti-spam measure after it is opened.
 '''.strip()
 
 NO_SKIP = '''
-  - type: checkboxes
+  - type: markdown
     attributes:
-      label: DO NOT REMOVE OR SKIP THE ISSUE TEMPLATE
-      description: Fill all fields even if you think it is irrelevant for the issue
-      options:
-        - label: I understand that I will be **blocked** if I *intentionally* remove or skip any mandatory\\* field
-          required: true
+      value: |
+        > [!IMPORTANT]
+        > Not providing the required (*) information or removing the template will result in your issue being closed and ignored.
 '''.strip()
 
 
 def main():
-    fields = {'no_skip': NO_SKIP}
-    fields['verbose'] = VERBOSE_TMPL % fields
-    fields['verbose_optional'] = re.sub(r'(\n\s+validations:)?\n\s+required: true', '', fields['verbose'])
+    fields = {
+        'no_skip': NO_SKIP,
+        'verbose': VERBOSE,
+        'verbose_optional': re.sub(r'(\n\s+validations:)?\n\s+required: true', '', VERBOSE),
+    }
 
     infile, outfile = get_filename_args(has_infile=True)
     write_file(outfile, read_file(infile) % fields)
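For context on the `make_issue_template.py` refactor above, here is a self-contained sketch (a toy template string, not the real `.yml` inputs) of the `%`-substitution it performs: `%(field)s` slots expand from the `fields` dict, the doubled `%%` in the tmpl files' URLs survives as a literal `%`, and `verbose_optional` is derived with the same regex as in the diff:

```python
import re

# Toy stand-in for the VERBOSE block above
VERBOSE = '''\
      options:
        - label: Run **your** yt-dlp command with **-vU** flag added
          required: true'''

fields = {
    'verbose': VERBOSE,
    # Same regex as the diff: drop `required: true` (and any preceding
    # `validations:` line) to produce the optional variant of the block
    'verbose_optional': re.sub(r'(\n\s+validations:)?\n\s+required: true', '', VERBOSE),
}

# Toy stand-in for a .github/ISSUE_TEMPLATE_tmpl/*.yml file
template = 'url: https://github.com/yt-dlp/yt-dlp/issues?q=is%%3Aissue\n%(verbose_optional)s\n'
print(template % fields)  # '%%' collapses to '%', '%(verbose_optional)s' expands
```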
@ -10,6 +10,9 @@
|
|||||||
from inspect import getsource
|
from inspect import getsource
|
||||||
|
|
||||||
from devscripts.utils import get_filename_args, read_file, write_file
|
from devscripts.utils import get_filename_args, read_file, write_file
|
||||||
|
from yt_dlp.extractor import import_extractors
|
||||||
|
from yt_dlp.extractor.common import InfoExtractor, SearchInfoExtractor
|
||||||
|
from yt_dlp.globals import extractors
|
||||||
|
|
||||||
NO_ATTR = object()
|
NO_ATTR = object()
|
||||||
STATIC_CLASS_PROPERTIES = [
|
STATIC_CLASS_PROPERTIES = [
|
||||||
@ -38,8 +41,7 @@ def main():
|
|||||||
|
|
||||||
lazy_extractors_filename = get_filename_args(default_outfile='yt_dlp/extractor/lazy_extractors.py')
|
lazy_extractors_filename = get_filename_args(default_outfile='yt_dlp/extractor/lazy_extractors.py')
|
||||||
|
|
||||||
from yt_dlp.extractor.extractors import _ALL_CLASSES
|
import_extractors()
|
||||||
from yt_dlp.extractor.common import InfoExtractor, SearchInfoExtractor
|
|
||||||
|
|
||||||
DummyInfoExtractor = type('InfoExtractor', (InfoExtractor,), {'IE_NAME': NO_ATTR})
|
DummyInfoExtractor = type('InfoExtractor', (InfoExtractor,), {'IE_NAME': NO_ATTR})
|
||||||
module_src = '\n'.join((
|
module_src = '\n'.join((
|
||||||
@ -47,7 +49,7 @@ def main():
|
|||||||
' _module = None',
|
' _module = None',
|
||||||
*extra_ie_code(DummyInfoExtractor),
|
*extra_ie_code(DummyInfoExtractor),
|
||||||
'\nclass LazyLoadSearchExtractor(LazyLoadExtractor):\n pass\n',
|
'\nclass LazyLoadSearchExtractor(LazyLoadExtractor):\n pass\n',
|
||||||
*build_ies(_ALL_CLASSES, (InfoExtractor, SearchInfoExtractor), DummyInfoExtractor),
|
*build_ies(list(extractors.value.values()), (InfoExtractor, SearchInfoExtractor), DummyInfoExtractor),
|
||||||
))
|
))
|
||||||
|
|
||||||
write_file(lazy_extractors_filename, f'{module_src}\n')
|
write_file(lazy_extractors_filename, f'{module_src}\n')
|
||||||
@ -73,7 +75,7 @@ def build_ies(ies, bases, attr_base):
|
|||||||
if ie in ies:
|
if ie in ies:
|
||||||
names.append(ie.__name__)
|
names.append(ie.__name__)
|
||||||
|
|
||||||
yield f'\n_ALL_CLASSES = [{", ".join(names)}]'
|
yield '\n_CLASS_LOOKUP = {%s}' % ', '.join(f'{name!r}: {name}' for name in names)
|
||||||
|
|
||||||
|
|
||||||
def sort_ies(ies, ignored_bases):
|
def sort_ies(ies, ignored_bases):
|
||||||
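
The switch from `_ALL_CLASSES` to `_CLASS_LOOKUP` changes what the generated lazy-extractor module contains: a dict keyed by class name instead of a plain list. A small self-contained illustration of the two generated source lines, with hypothetical class names:

    # Hypothetical names; the real ones come from the extractor registry.
    names = ['YoutubeIE', 'GenericIE']
    old_line = f'\n_ALL_CLASSES = [{", ".join(names)}]'
    new_line = '\n_CLASS_LOOKUP = {%s}' % ', '.join(f'{name!r}: {name}' for name in names)
    print(old_line)  # _ALL_CLASSES = [YoutubeIE, GenericIE]
    print(new_line)  # _CLASS_LOOKUP = {'YoutubeIE': YoutubeIE, 'GenericIE': GenericIE}
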
@@ -10,10 +10,21 @@
 from devscripts.utils import get_filename_args, write_file
 from yt_dlp.extractor import list_extractor_classes

+TEMPLATE = '''\
+# Supported sites
+
+Below is a list of all extractors that are currently included with yt-dlp.
+If a site is not listed here, it might still be supported by yt-dlp's embed extraction or generic extractor.
+Not all sites listed here are guaranteed to work; websites are constantly changing and sometimes this breaks yt-dlp's support for them.
+The only reliable way to check if a site is supported is to try it.
+
+{ie_list}
+'''
+

 def main():
     out = '\n'.join(ie.description() for ie in list_extractor_classes() if ie.IE_DESC is not False)
-    write_file(get_filename_args(), f'# Supported sites\n{out}\n')
+    write_file(get_filename_args(), TEMPLATE.format(ie_list=out))


 if __name__ == '__main__':
@@ -25,7 +25,8 @@ def parse_args():


 def run_tests(*tests, pattern=None, ci=False):
-    run_core = 'core' in tests or (not pattern and not tests)
+    # XXX: hatch uses `tests` if no arguments are passed
+    run_core = 'core' in tests or 'tests' in tests or (not pattern and not tests)
     run_download = 'download' in tests

     pytest_args = args.pytest_args or os.getenv('HATCH_TEST_ARGS', '')
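
The extra `'tests' in tests` check exists because, per the comment above, hatch passes the literal positional argument `tests` when no explicit target is given, and that should behave like a plain core run. A standalone sketch of just the dispatch condition:

    def should_run_core(tests, pattern=None):
        # `tests` is the tuple of positional CLI arguments
        return 'core' in tests or 'tests' in tests or (not pattern and not tests)

    assert should_run_core(())                # no arguments
    assert should_run_core(('tests',))        # hatch's default argument
    assert not should_run_core(('download',))
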
@@ -384,6 +384,7 @@ select = [
     "W391",
     "W504",
 ]
+exclude = "*/extractor/lazy_extractors.py,*venv*,*/test/testdata/sigs/player-*.js,.idea,.vscode"

 [tool.pytest.ini_options]
 addopts = "-ra -v --strict-markers"
@@ -1,4 +1,10 @@
 # Supported sites
+
+Below is a list of all extractors that are currently included with yt-dlp.
+If a site is not listed here, it might still be supported by yt-dlp's embed extraction or generic extractor.
+Not all sites listed here are guaranteed to work; websites are constantly changing and sometimes this breaks yt-dlp's support for them.
+The only reliable way to check if a site is supported is to try it.
+
 - **17live**
 - **17live:clip**
 - **1News**: 1news.co.nz article videos
@@ -314,7 +320,8 @@ # Supported sites
 - **curiositystream**: [*curiositystream*](## "netrc machine")
 - **curiositystream:collections**: [*curiositystream*](## "netrc machine")
 - **curiositystream:series**: [*curiositystream*](## "netrc machine")
- - **CWTV**
+- **cwtv**
+- **cwtv:movie**
 - **Cybrary**: [*cybrary*](## "netrc machine")
 - **CybraryCourse**: [*cybrary*](## "netrc machine")
 - **DacastPlaylist**
@@ -349,6 +356,7 @@ # Supported sites
 - **DigitalConcertHall**: [*digitalconcerthall*](## "netrc machine") DigitalConcertHall extractor
 - **DigitallySpeaking**
 - **Digiteka**
+- **Digiview**
 - **DiscogsReleasePlaylist**
 - **DiscoveryLife**
 - **DiscoveryNetworksDe**
@@ -465,9 +473,9 @@ # Supported sites
 - **fptplay**: fptplay.vn
 - **FranceCulture**
 - **FranceInter**
-- **FranceTV**
+- **francetv**
+- **francetv:site**
 - **francetvinfo.fr**
-- **FranceTVSite**
 - **Freesound**
 - **freespeech.org**
 - **freetv:series**
@@ -499,7 +507,7 @@ # Supported sites
 - **GediDigital**
 - **gem.cbc.ca**: [*cbcgem*](## "netrc machine")
 - **gem.cbc.ca:live**
-- **gem.cbc.ca:playlist**
+- **gem.cbc.ca:playlist**: [*cbcgem*](## "netrc machine")
 - **Genius**
 - **GeniusLyrics**
 - **Germanupa**: germanupa.de
test/helper.py (198 lines changed)
@@ -101,87 +101,109 @@ def getwebpagetestcases():
 md5 = lambda s: hashlib.md5(s.encode()).hexdigest()


-def expect_value(self, got, expected, field):
-    if isinstance(expected, str) and expected.startswith('re:'):
-        match_str = expected[len('re:'):]
-        match_rex = re.compile(match_str)
-
-        self.assertTrue(
-            isinstance(got, str),
-            f'Expected a {str.__name__} object, but got {type(got).__name__} for field {field}')
-        self.assertTrue(
-            match_rex.match(got),
-            f'field {field} (value: {got!r}) should match {match_str!r}')
-    elif isinstance(expected, str) and expected.startswith('startswith:'):
-        start_str = expected[len('startswith:'):]
-        self.assertTrue(
-            isinstance(got, str),
-            f'Expected a {str.__name__} object, but got {type(got).__name__} for field {field}')
-        self.assertTrue(
-            got.startswith(start_str),
-            f'field {field} (value: {got!r}) should start with {start_str!r}')
-    elif isinstance(expected, str) and expected.startswith('contains:'):
-        contains_str = expected[len('contains:'):]
-        self.assertTrue(
-            isinstance(got, str),
-            f'Expected a {str.__name__} object, but got {type(got).__name__} for field {field}')
-        self.assertTrue(
-            contains_str in got,
-            f'field {field} (value: {got!r}) should contain {contains_str!r}')
-    elif isinstance(expected, type):
-        self.assertTrue(
-            isinstance(got, expected),
-            f'Expected type {expected!r} for field {field}, but got value {got!r} of type {type(got)!r}')
-    elif isinstance(expected, dict) and isinstance(got, dict):
-        expect_dict(self, got, expected)
-    elif isinstance(expected, list) and isinstance(got, list):
-        self.assertEqual(
-            len(expected), len(got),
-            f'Expect a list of length {len(expected)}, but got a list of length {len(got)} for field {field}')
-        for index, (item_got, item_expected) in enumerate(zip(got, expected)):
-            type_got = type(item_got)
-            type_expected = type(item_expected)
-            self.assertEqual(
-                type_expected, type_got,
-                f'Type mismatch for list item at index {index} for field {field}, '
-                f'expected {type_expected!r}, got {type_got!r}')
-            expect_value(self, item_got, item_expected, field)
-    else:
-        if isinstance(expected, str) and expected.startswith('md5:'):
-            self.assertTrue(
-                isinstance(got, str),
-                f'Expected field {field} to be a unicode object, but got value {got!r} of type {type(got)!r}')
-            got = 'md5:' + md5(got)
-        elif isinstance(expected, str) and re.match(r'^(?:min|max)?count:\d+', expected):
-            self.assertTrue(
-                isinstance(got, (list, dict)),
-                f'Expected field {field} to be a list or a dict, but it is of type {type(got).__name__}')
-            op, _, expected_num = expected.partition(':')
-            expected_num = int(expected_num)
-            if op == 'mincount':
-                assert_func = assertGreaterEqual
-                msg_tmpl = 'Expected %d items in field %s, but only got %d'
-            elif op == 'maxcount':
-                assert_func = assertLessEqual
-                msg_tmpl = 'Expected maximum %d items in field %s, but got %d'
-            elif op == 'count':
-                assert_func = assertEqual
-                msg_tmpl = 'Expected exactly %d items in field %s, but got %d'
-            else:
-                assert False
-            assert_func(
-                self, len(got), expected_num,
-                msg_tmpl % (expected_num, field, len(got)))
-            return
-        self.assertEqual(
-            expected, got,
-            f'Invalid value for field {field}, expected {expected!r}, got {got!r}')
+def _iter_differences(got, expected, field):
+    if isinstance(expected, str):
+        op, _, val = expected.partition(':')
+        if op in ('mincount', 'maxcount', 'count'):
+            if not isinstance(got, (list, dict)):
+                yield field, f'expected either {list.__name__} or {dict.__name__}, got {type(got).__name__}'
+                return
+
+            expected_num = int(val)
+            got_num = len(got)
+            if op == 'mincount':
+                if got_num < expected_num:
+                    yield field, f'expected at least {val} items, got {got_num}'
+                return
+
+            if op == 'maxcount':
+                if got_num > expected_num:
+                    yield field, f'expected at most {val} items, got {got_num}'
+                return
+
+            assert op == 'count'
+            if got_num != expected_num:
+                yield field, f'expected exactly {val} items, got {got_num}'
+            return
+
+        if not isinstance(got, str):
+            yield field, f'expected {str.__name__}, got {type(got).__name__}'
+            return
+
+        if op == 're':
+            if not re.match(val, got):
+                yield field, f'should match {val!r}, got {got!r}'
+            return
+
+        if op == 'startswith':
+            if not got.startswith(val):
+                yield field, f'should start with {val!r}, got {got!r}'
+            return
+
+        if op == 'contains':
+            if val not in got:
+                yield field, f'should contain {val!r}, got {got!r}'
+            return
+
+        if op == 'md5':
+            hash_val = md5(got)
+            if hash_val != val:
+                yield field, f'expected hash {val}, got {hash_val}'
+            return
+
+        if got != expected:
+            yield field, f'expected {expected!r}, got {got!r}'
+        return
+
+    if isinstance(expected, dict) and isinstance(got, dict):
+        for key, expected_val in expected.items():
+            if key not in got:
+                yield field, f'missing key: {key!r}'
+                continue
+
+            field_name = key if field is None else f'{field}.{key}'
+            yield from _iter_differences(got[key], expected_val, field_name)
+        return
+
+    if isinstance(expected, type):
+        if not isinstance(got, expected):
+            yield field, f'expected {expected.__name__}, got {type(got).__name__}'
+        return
+
+    if isinstance(expected, list) and isinstance(got, list):
+        # TODO: clever diffing algorithm lmao
+        if len(expected) != len(got):
+            yield field, f'expected length of {len(expected)}, got {len(got)}'
+            return
+
+        for index, (got_val, expected_val) in enumerate(zip(got, expected)):
+            field_name = str(index) if field is None else f'{field}.{index}'
+            yield from _iter_differences(got_val, expected_val, field_name)
+        return
+
+    if got != expected:
+        yield field, f'expected {expected!r}, got {got!r}'
+
+
+def _expect_value(message, got, expected, field):
+    mismatches = list(_iter_differences(got, expected, field))
+    if not mismatches:
+        return
+
+    fields = [field for field, _ in mismatches if field is not None]
+    return ''.join((
+        message, f' ({", ".join(fields)})' if fields else '',
+        *(f'\n\t{field}: {message}' for field, message in mismatches)))
+
+
+def expect_value(self, got, expected, field):
+    if message := _expect_value('values differ', got, expected, field):
+        self.fail(message)


 def expect_dict(self, got_dict, expected_dict):
-    for info_field, expected in expected_dict.items():
-        got = got_dict.get(info_field)
-        expect_value(self, got, expected, info_field)
+    if message := _expect_value('dictionaries differ', got_dict, expected_dict, None):
+        self.fail(message)


 def sanitize_got_info_dict(got_dict):
@@ -237,6 +259,20 @@ def sanitize(key, value):


 def expect_info_dict(self, got_dict, expected_dict):
+    ALLOWED_KEYS_SORT_ORDER = (
+        # NB: Keep in sync with the docstring of extractor/common.py
+        'id', 'ext', 'direct', 'display_id', 'title', 'alt_title', 'description', 'media_type',
+        'uploader', 'uploader_id', 'uploader_url', 'channel', 'channel_id', 'channel_url', 'channel_is_verified',
+        'channel_follower_count', 'comment_count', 'view_count', 'concurrent_view_count',
+        'like_count', 'dislike_count', 'repost_count', 'average_rating', 'age_limit', 'duration', 'thumbnail', 'heatmap',
+        'chapters', 'chapter', 'chapter_number', 'chapter_id', 'start_time', 'end_time', 'section_start', 'section_end',
+        'categories', 'tags', 'cast', 'composers', 'artists', 'album_artists', 'creators', 'genres',
+        'track', 'track_number', 'track_id', 'album', 'album_type', 'disc_number',
+        'series', 'series_id', 'season', 'season_number', 'season_id', 'episode', 'episode_number', 'episode_id',
+        'timestamp', 'upload_date', 'release_timestamp', 'release_date', 'release_year', 'modified_timestamp', 'modified_date',
+        'playable_in_embed', 'availability', 'live_status', 'location', 'license', '_old_archive_ids',
+    )
+
     expect_dict(self, got_dict, expected_dict)
     # Check for the presence of mandatory fields
     if got_dict.get('_type') not in ('playlist', 'multi_video'):
@@ -252,7 +288,13 @@ def expect_info_dict(self, got_dict, expected_dict):

     test_info_dict = sanitize_got_info_dict(got_dict)

-    missing_keys = set(test_info_dict.keys()) - set(expected_dict.keys())
+    # Check for invalid/misspelled field names being returned by the extractor
+    invalid_keys = sorted(test_info_dict.keys() - ALLOWED_KEYS_SORT_ORDER)
+    self.assertFalse(invalid_keys, f'Invalid fields returned by the extractor: {", ".join(invalid_keys)}')
+
+    missing_keys = sorted(
+        test_info_dict.keys() - expected_dict.keys(),
+        key=lambda x: ALLOWED_KEYS_SORT_ORDER.index(x))
     if missing_keys:
         def _repr(v):
             if isinstance(v, str):
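
The rewrite above replaces assert-on-first-mismatch with a generator that walks the whole structure and yields every difference, so a failing test reports all bad fields at once. A self-contained toy version of that idea (not the helper itself):

    def iter_differences(got, expected, field=None):
        # Recurse through dicts; yield (field, message) for every mismatch
        if isinstance(expected, dict) and isinstance(got, dict):
            for key, expected_val in expected.items():
                name = key if field is None else f'{field}.{key}'
                if key not in got:
                    yield name, 'missing key'
                else:
                    yield from iter_differences(got[key], expected_val, name)
            return
        if got != expected:
            yield field, f'expected {expected!r}, got {got!r}'

    mismatches = list(iter_differences(
        {'id': '1', 'title': 'b'}, {'id': '2', 'title': 'b', 'ext': 'mp4'}))
    assert mismatches == [('id', "expected '2', got '1'"), ('ext', 'missing key')]
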
@@ -6,6 +6,8 @@
 import unittest
 from unittest.mock import patch

+from yt_dlp.globals import all_plugins_loaded
+
 sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))


@@ -1427,6 +1429,12 @@ def check_for_cookie_header(result):
         self.assertFalse(result.get('cookies'), msg='Cookies set in cookies field for wrong domain')
         self.assertFalse(ydl.cookiejar.get_cookie_header(fmt['url']), msg='Cookies set in cookiejar for wrong domain')

+    def test_load_plugins_compat(self):
+        # Should try to reload plugins if they haven't already been loaded
+        all_plugins_loaded.value = False
+        FakeYDL().close()
+        assert all_plugins_loaded.value
+

 if __name__ == '__main__':
     unittest.main()
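
The new `test_load_plugins_compat` pins down a compatibility guarantee: code that constructs a `YoutubeDL` through the Python API, without going through the CLI, still gets plugins loaded. A hedged usage sketch against the post-merge API (the `all_plugins_loaded` global is introduced by this change):

    from yt_dlp import YoutubeDL
    from yt_dlp.globals import all_plugins_loaded

    all_plugins_loaded.value = False
    with YoutubeDL({'quiet': True}) as ydl:  # __init__ triggers load_all_plugins()
        pass
    assert all_plugins_loaded.value
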
@@ -9,7 +9,7 @@

 import math

-from yt_dlp.jsinterp import JS_Undefined, JSInterpreter
+from yt_dlp.jsinterp import JS_Undefined, JSInterpreter, js_number_to_string


 class NaN:
@@ -93,6 +93,16 @@ def test_operators(self):
         self._test('function f(){return 0 ?? 42;}', 0)
         self._test('function f(){return "life, the universe and everything" < 42;}', False)
         self._test('function f(){return 0 - 7 * - 6;}', 42)
+        self._test('function f(){return true << "5";}', 32)
+        self._test('function f(){return true << true;}', 2)
+        self._test('function f(){return "19" & "21.9";}', 17)
+        self._test('function f(){return "19" & false;}', 0)
+        self._test('function f(){return "11.0" >> "2.1";}', 2)
+        self._test('function f(){return 5 ^ 9;}', 12)
+        self._test('function f(){return 0.0 << NaN}', 0)
+        self._test('function f(){return null << undefined}', 0)
+        # TODO: Does not work due to number too large
+        # self._test('function f(){return 21 << 4294967297}', 42)

     def test_array_access(self):
         self._test('function f(){var x = [1,2,3]; x[0] = 4; x[0] = 5; x[2.0] = 7; return x;}', [5, 2, 7])
@@ -431,6 +441,27 @@ def test_slice(self):
         self._test('function f(){return "012345678".slice(-1, 1)}', '')
         self._test('function f(){return "012345678".slice(-3, -1)}', '67')

+    def test_js_number_to_string(self):
+        for test, radix, expected in [
+            (0, None, '0'),
+            (-0, None, '0'),
+            (0.0, None, '0'),
+            (-0.0, None, '0'),
+            (math.nan, None, 'NaN'),
+            (-math.nan, None, 'NaN'),
+            (math.inf, None, 'Infinity'),
+            (-math.inf, None, '-Infinity'),
+            (10 ** 21.5, 8, '526665530627250154000000'),
+            (6, 2, '110'),
+            (254, 16, 'fe'),
+            (-10, 2, '-1010'),
+            (-0xff, 2, '-11111111'),
+            (0.1 + 0.2, 16, '0.4cccccccccccd'),
+            (1234.1234, 10, '1234.1234'),
+            # (1000000000000000128, 10, '1000000000000000100')
+        ]:
+            assert js_number_to_string(test, radix) == expected
+

 if __name__ == '__main__':
     unittest.main()
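
`js_number_to_string` mirrors JavaScript's `Number.prototype.toString(radix)`. For plain integers the semantics are ordinary base conversion with lowercase digits; a minimal sketch of just that integer case (not yt-dlp's implementation, which also handles floats, NaN and infinities):

    import string

    def int_to_radix(num, radix=10):
        # Lowercase digits, like JS Number#toString
        digits = string.digits + string.ascii_lowercase
        if num == 0:
            return '0'
        sign, num = ('-', -num) if num < 0 else ('', num)
        out = ''
        while num:
            num, rem = divmod(num, radix)
            out = digits[rem] + out
        return sign + out

    assert int_to_radix(254, 16) == 'fe'
    assert int_to_radix(-10, 2) == '-1010'
    assert int_to_radix(6, 2) == '110'
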
@@ -720,6 +720,15 @@ def test_allproxy(self, handler):
                 rh, Request(
                     f'http://127.0.0.1:{self.http_port}/headers', proxies={'all': 'http://10.255.255.255'})).close()

+    @pytest.mark.skip_handlers_if(lambda _, handler: handler not in ['Urllib', 'CurlCFFI'], 'handler does not support keep_header_casing')
+    def test_keep_header_casing(self, handler):
+        with handler() as rh:
+            res = validate_and_send(
+                rh, Request(
+                    f'http://127.0.0.1:{self.http_port}/headers', headers={'X-test-heaDer': 'test'}, extensions={'keep_header_casing': True})).read().decode()
+
+        assert 'X-test-heaDer: test' in res
+

 @pytest.mark.parametrize('handler', ['Urllib', 'Requests', 'CurlCFFI'], indirect=True)
 class TestClientCertificate:
@@ -1289,6 +1298,7 @@ class HTTPSupportedRH(ValidationRH):
             ({'legacy_ssl': False}, False),
             ({'legacy_ssl': True}, False),
             ({'legacy_ssl': 'notabool'}, AssertionError),
+            ({'keep_header_casing': True}, UnsupportedRequest),
         ]),
         ('Requests', 'http', [
             ({'cookiejar': 'notacookiejar'}, AssertionError),
@@ -1299,6 +1309,9 @@ class HTTPSupportedRH(ValidationRH):
             ({'legacy_ssl': False}, False),
             ({'legacy_ssl': True}, False),
             ({'legacy_ssl': 'notabool'}, AssertionError),
+            ({'keep_header_casing': False}, False),
+            ({'keep_header_casing': True}, False),
+            ({'keep_header_casing': 'notabool'}, AssertionError),
         ]),
         ('CurlCFFI', 'http', [
             ({'cookiejar': 'notacookiejar'}, AssertionError),
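
The `keep_header_casing` extension validated above lets a caller opt out of header-name normalization, for servers that (incorrectly) treat header names as case-sensitive. A hedged sketch of how a request opts in, assuming the post-merge networking API; the URL is a hypothetical endpoint:

    from yt_dlp.networking import Request

    req = Request(
        'http://example.test/headers',            # hypothetical endpoint
        headers={'X-test-heaDer': 'test'},        # casing to preserve on the wire
        extensions={'keep_header_casing': True},
    )
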
@@ -10,22 +10,71 @@
 sys.path.append(str(TEST_DATA_DIR))
 importlib.invalidate_caches()

-from yt_dlp.utils import Config
-from yt_dlp.plugins import PACKAGE_NAME, directories, load_plugins
+from yt_dlp.plugins import (
+    PACKAGE_NAME,
+    PluginSpec,
+    directories,
+    load_plugins,
+    load_all_plugins,
+    register_plugin_spec,
+)
+
+from yt_dlp.globals import (
+    extractors,
+    postprocessors,
+    plugin_dirs,
+    plugin_ies,
+    plugin_pps,
+    all_plugins_loaded,
+    plugin_specs,
+)
+
+
+EXTRACTOR_PLUGIN_SPEC = PluginSpec(
+    module_name='extractor',
+    suffix='IE',
+    destination=extractors,
+    plugin_destination=plugin_ies,
+)
+
+POSTPROCESSOR_PLUGIN_SPEC = PluginSpec(
+    module_name='postprocessor',
+    suffix='PP',
+    destination=postprocessors,
+    plugin_destination=plugin_pps,
+)
+
+
+def reset_plugins():
+    plugin_ies.value = {}
+    plugin_pps.value = {}
+    plugin_dirs.value = ['default']
+    plugin_specs.value = {}
+    all_plugins_loaded.value = False
+    # Clearing override plugins is probably difficult
+    for module_name in tuple(sys.modules):
+        for plugin_type in ('extractor', 'postprocessor'):
+            if module_name.startswith(f'{PACKAGE_NAME}.{plugin_type}.'):
+                del sys.modules[module_name]
+
+    importlib.invalidate_caches()


 class TestPlugins(unittest.TestCase):

     TEST_PLUGIN_DIR = TEST_DATA_DIR / PACKAGE_NAME

+    def setUp(self):
+        reset_plugins()
+
+    def tearDown(self):
+        reset_plugins()
+
     def test_directories_containing_plugins(self):
         self.assertIn(self.TEST_PLUGIN_DIR, map(Path, directories()))

     def test_extractor_classes(self):
-        for module_name in tuple(sys.modules):
-            if module_name.startswith(f'{PACKAGE_NAME}.extractor'):
-                del sys.modules[module_name]
-        plugins_ie = load_plugins('extractor', 'IE')
+        plugins_ie = load_plugins(EXTRACTOR_PLUGIN_SPEC)

         self.assertIn(f'{PACKAGE_NAME}.extractor.normal', sys.modules.keys())
         self.assertIn('NormalPluginIE', plugins_ie.keys())
@@ -35,17 +84,29 @@ def test_extractor_classes(self):
             f'{PACKAGE_NAME}.extractor._ignore' in sys.modules,
             'loaded module beginning with underscore')
         self.assertNotIn('IgnorePluginIE', plugins_ie.keys())
+        self.assertNotIn('IgnorePluginIE', plugin_ies.value)

         # Don't load extractors with underscore prefix
         self.assertNotIn('_IgnoreUnderscorePluginIE', plugins_ie.keys())
+        self.assertNotIn('_IgnoreUnderscorePluginIE', plugin_ies.value)

         # Don't load extractors not specified in __all__ (if supplied)
         self.assertNotIn('IgnoreNotInAllPluginIE', plugins_ie.keys())
+        self.assertNotIn('IgnoreNotInAllPluginIE', plugin_ies.value)
         self.assertIn('InAllPluginIE', plugins_ie.keys())
+        self.assertIn('InAllPluginIE', plugin_ies.value)
+
+        # Don't load override extractors
+        self.assertNotIn('OverrideGenericIE', plugins_ie.keys())
+        self.assertNotIn('OverrideGenericIE', plugin_ies.value)
+        self.assertNotIn('_UnderscoreOverrideGenericIE', plugins_ie.keys())
+        self.assertNotIn('_UnderscoreOverrideGenericIE', plugin_ies.value)

     def test_postprocessor_classes(self):
-        plugins_pp = load_plugins('postprocessor', 'PP')
+        plugins_pp = load_plugins(POSTPROCESSOR_PLUGIN_SPEC)
         self.assertIn('NormalPluginPP', plugins_pp.keys())
+        self.assertIn(f'{PACKAGE_NAME}.postprocessor.normal', sys.modules.keys())
+        self.assertIn('NormalPluginPP', plugin_pps.value)

     def test_importing_zipped_module(self):
         zip_path = TEST_DATA_DIR / 'zipped_plugins.zip'
@@ -58,10 +119,10 @@ def test_importing_zipped_module(self):
                 package = importlib.import_module(f'{PACKAGE_NAME}.{plugin_type}')
                 self.assertIn(zip_path / PACKAGE_NAME / plugin_type, map(Path, package.__path__))

-            plugins_ie = load_plugins('extractor', 'IE')
+            plugins_ie = load_plugins(EXTRACTOR_PLUGIN_SPEC)
             self.assertIn('ZippedPluginIE', plugins_ie.keys())

-            plugins_pp = load_plugins('postprocessor', 'PP')
+            plugins_pp = load_plugins(POSTPROCESSOR_PLUGIN_SPEC)
             self.assertIn('ZippedPluginPP', plugins_pp.keys())

         finally:
@@ -69,23 +130,116 @@ def test_importing_zipped_module(self):
             os.remove(zip_path)
             importlib.invalidate_caches()  # reset the import caches

-    def test_plugin_dirs(self):
-        # Internal plugin dirs hack for CLI --plugin-dirs
-        # To be replaced with proper system later
-        custom_plugin_dir = TEST_DATA_DIR / 'plugin_packages'
-        Config._plugin_dirs = [str(custom_plugin_dir)]
-        importlib.invalidate_caches()  # reset the import caches
+    def test_reloading_plugins(self):
+        reload_plugins_path = TEST_DATA_DIR / 'reload_plugins'
+        load_plugins(EXTRACTOR_PLUGIN_SPEC)
+        load_plugins(POSTPROCESSOR_PLUGIN_SPEC)

+        # Remove default folder and add reload_plugin path
+        sys.path.remove(str(TEST_DATA_DIR))
+        sys.path.append(str(reload_plugins_path))
+        importlib.invalidate_caches()
         try:
-            package = importlib.import_module(f'{PACKAGE_NAME}.extractor')
-            self.assertIn(custom_plugin_dir / 'testpackage' / PACKAGE_NAME / 'extractor', map(Path, package.__path__))
+            for plugin_type in ('extractor', 'postprocessor'):
+                package = importlib.import_module(f'{PACKAGE_NAME}.{plugin_type}')
+                self.assertIn(reload_plugins_path / PACKAGE_NAME / plugin_type, map(Path, package.__path__))

-            plugins_ie = load_plugins('extractor', 'IE')
-            self.assertIn('PackagePluginIE', plugins_ie.keys())
+            plugins_ie = load_plugins(EXTRACTOR_PLUGIN_SPEC)
+            self.assertIn('NormalPluginIE', plugins_ie.keys())
+            self.assertTrue(
+                plugins_ie['NormalPluginIE'].REPLACED,
+                msg='Reloading has not replaced original extractor plugin')
+            self.assertTrue(
+                extractors.value['NormalPluginIE'].REPLACED,
+                msg='Reloading has not replaced original extractor plugin globally')
+
+            plugins_pp = load_plugins(POSTPROCESSOR_PLUGIN_SPEC)
+            self.assertIn('NormalPluginPP', plugins_pp.keys())
+            self.assertTrue(plugins_pp['NormalPluginPP'].REPLACED,
+                            msg='Reloading has not replaced original postprocessor plugin')
+            self.assertTrue(
+                postprocessors.value['NormalPluginPP'].REPLACED,
+                msg='Reloading has not replaced original postprocessor plugin globally')
+
         finally:
-            Config._plugin_dirs = []
-            importlib.invalidate_caches()  # reset the import caches
+            sys.path.remove(str(reload_plugins_path))
+            sys.path.append(str(TEST_DATA_DIR))
+            importlib.invalidate_caches()
+
+    def test_extractor_override_plugin(self):
+        load_plugins(EXTRACTOR_PLUGIN_SPEC)
+
+        from yt_dlp.extractor.generic import GenericIE
+
+        self.assertEqual(GenericIE.TEST_FIELD, 'override')
+        self.assertEqual(GenericIE.SECONDARY_TEST_FIELD, 'underscore-override')
+
+        self.assertEqual(GenericIE.IE_NAME, 'generic+override+underscore-override')
+        importlib.invalidate_caches()
+        # test that loading a second time doesn't wrap a second time
+        load_plugins(EXTRACTOR_PLUGIN_SPEC)
+        from yt_dlp.extractor.generic import GenericIE
+        self.assertEqual(GenericIE.IE_NAME, 'generic+override+underscore-override')
+
+    def test_load_all_plugin_types(self):
+
+        # no plugin specs registered
+        load_all_plugins()
+
+        self.assertNotIn(f'{PACKAGE_NAME}.extractor.normal', sys.modules.keys())
+        self.assertNotIn(f'{PACKAGE_NAME}.postprocessor.normal', sys.modules.keys())
+
+        register_plugin_spec(EXTRACTOR_PLUGIN_SPEC)
+        register_plugin_spec(POSTPROCESSOR_PLUGIN_SPEC)
+        load_all_plugins()
+        self.assertTrue(all_plugins_loaded.value)
+
+        self.assertIn(f'{PACKAGE_NAME}.extractor.normal', sys.modules.keys())
+        self.assertIn(f'{PACKAGE_NAME}.postprocessor.normal', sys.modules.keys())
+
+    def test_no_plugin_dirs(self):
+        register_plugin_spec(EXTRACTOR_PLUGIN_SPEC)
+        register_plugin_spec(POSTPROCESSOR_PLUGIN_SPEC)
+
+        plugin_dirs.value = []
+        load_all_plugins()
+
+        self.assertNotIn(f'{PACKAGE_NAME}.extractor.normal', sys.modules.keys())
+        self.assertNotIn(f'{PACKAGE_NAME}.postprocessor.normal', sys.modules.keys())
+
+    def test_set_plugin_dirs(self):
+        custom_plugin_dir = str(TEST_DATA_DIR / 'plugin_packages')
+        plugin_dirs.value = [custom_plugin_dir]
+
+        load_plugins(EXTRACTOR_PLUGIN_SPEC)
+
+        self.assertIn(f'{PACKAGE_NAME}.extractor.package', sys.modules.keys())
+        self.assertIn('PackagePluginIE', plugin_ies.value)
+
+    def test_invalid_plugin_dir(self):
+        plugin_dirs.value = ['invalid_dir']
+        with self.assertRaises(ValueError):
+            load_plugins(EXTRACTOR_PLUGIN_SPEC)
+
+    def test_append_plugin_dirs(self):
+        custom_plugin_dir = str(TEST_DATA_DIR / 'plugin_packages')
+
+        self.assertEqual(plugin_dirs.value, ['default'])
+        plugin_dirs.value.append(custom_plugin_dir)
+        self.assertEqual(plugin_dirs.value, ['default', custom_plugin_dir])
+
+        load_plugins(EXTRACTOR_PLUGIN_SPEC)
+
+        self.assertIn(f'{PACKAGE_NAME}.extractor.package', sys.modules.keys())
+        self.assertIn('PackagePluginIE', plugin_ies.value)
+
+    def test_get_plugin_spec(self):
+        register_plugin_spec(EXTRACTOR_PLUGIN_SPEC)
+        register_plugin_spec(POSTPROCESSOR_PLUGIN_SPEC)
+
+        self.assertEqual(plugin_specs.value.get('extractor'), EXTRACTOR_PLUGIN_SPEC)
+        self.assertEqual(plugin_specs.value.get('postprocessor'), POSTPROCESSOR_PLUGIN_SPEC)
+        self.assertIsNone(plugin_specs.value.get('invalid'))


 if __name__ == '__main__':
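
The tests above revolve around `PluginSpec`: each plugin type is described by a spec and registered globally, after which `load_all_plugins()` populates the corresponding registries. A hedged sketch of registering a spec through the post-merge API:

    from yt_dlp.globals import extractors, plugin_ies
    from yt_dlp.plugins import PluginSpec, load_all_plugins, register_plugin_spec

    register_plugin_spec(PluginSpec(
        module_name='extractor',        # looks in yt_dlp_plugins.extractor
        suffix='IE',                    # collects classes whose names end in IE
        destination=extractors,         # global extractor registry
        plugin_destination=plugin_ies,  # registry of plugin-provided extractors only
    ))
    load_all_plugins()
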
@@ -3,19 +3,20 @@
 # Allow direct execution
 import os
 import sys
-import unittest
-import unittest.mock
-import warnings
-import datetime as dt

 sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))


 import contextlib
+import datetime as dt
 import io
 import itertools
 import json
+import pickle
 import subprocess
+import unittest
+import unittest.mock
+import warnings
 import xml.etree.ElementTree

 from yt_dlp.compat import (
@@ -2087,21 +2088,26 @@ def test_http_header_dict(self):
         headers = HTTPHeaderDict()
         headers['ytdl-test'] = b'0'
         self.assertEqual(list(headers.items()), [('Ytdl-Test', '0')])
+        self.assertEqual(list(headers.sensitive().items()), [('ytdl-test', '0')])
         headers['ytdl-test'] = 1
         self.assertEqual(list(headers.items()), [('Ytdl-Test', '1')])
+        self.assertEqual(list(headers.sensitive().items()), [('ytdl-test', '1')])
         headers['Ytdl-test'] = '2'
         self.assertEqual(list(headers.items()), [('Ytdl-Test', '2')])
+        self.assertEqual(list(headers.sensitive().items()), [('Ytdl-test', '2')])
         self.assertTrue('ytDl-Test' in headers)
         self.assertEqual(str(headers), str(dict(headers)))
         self.assertEqual(repr(headers), str(dict(headers)))

         headers.update({'X-dlp': 'data'})
         self.assertEqual(set(headers.items()), {('Ytdl-Test', '2'), ('X-Dlp', 'data')})
+        self.assertEqual(set(headers.sensitive().items()), {('Ytdl-test', '2'), ('X-dlp', 'data')})
         self.assertEqual(dict(headers), {'Ytdl-Test': '2', 'X-Dlp': 'data'})
         self.assertEqual(len(headers), 2)
         self.assertEqual(headers.copy(), headers)
-        headers2 = HTTPHeaderDict({'X-dlp': 'data3'}, **headers, **{'X-dlp': 'data2'})
+        headers2 = HTTPHeaderDict({'X-dlp': 'data3'}, headers, **{'X-dlP': 'data2'})
         self.assertEqual(set(headers2.items()), {('Ytdl-Test', '2'), ('X-Dlp', 'data2')})
+        self.assertEqual(set(headers2.sensitive().items()), {('Ytdl-test', '2'), ('X-dlP', 'data2')})
         self.assertEqual(len(headers2), 2)
         headers2.clear()
         self.assertEqual(len(headers2), 0)
@@ -2109,16 +2115,23 @@ def test_http_header_dict(self):
         # ensure we prefer latter headers
         headers3 = HTTPHeaderDict({'Ytdl-TeSt': 1}, {'Ytdl-test': 2})
         self.assertEqual(set(headers3.items()), {('Ytdl-Test', '2')})
+        self.assertEqual(set(headers3.sensitive().items()), {('Ytdl-test', '2')})
         del headers3['ytdl-tesT']
         self.assertEqual(dict(headers3), {})

         headers4 = HTTPHeaderDict({'ytdl-test': 'data;'})
         self.assertEqual(set(headers4.items()), {('Ytdl-Test', 'data;')})
+        self.assertEqual(set(headers4.sensitive().items()), {('ytdl-test', 'data;')})

         # common mistake: strip whitespace from values
         # https://github.com/yt-dlp/yt-dlp/issues/8729
         headers5 = HTTPHeaderDict({'ytdl-test': ' data; '})
         self.assertEqual(set(headers5.items()), {('Ytdl-Test', 'data;')})
+        self.assertEqual(set(headers5.sensitive().items()), {('ytdl-test', 'data;')})
+
+        # test if picklable
+        headers6 = HTTPHeaderDict(a=1, b=2)
+        self.assertEqual(pickle.loads(pickle.dumps(headers6)), headers6)

     def test_extract_basic_auth(self):
         assert extract_basic_auth('http://:foo.bar') == ('http://:foo.bar', None)
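
The new `sensitive()` assertions document the split behaviour of `HTTPHeaderDict`: normal iteration yields title-cased names, while the sensitive view keeps whatever casing was last used to set each header. A short sketch against the post-merge API:

    from yt_dlp.utils.networking import HTTPHeaderDict

    headers = HTTPHeaderDict()
    headers['x-Dlp-Test'] = 'value'
    assert list(headers.items()) == [('X-Dlp-Test', 'value')]              # normalized
    assert list(headers.sensitive().items()) == [('x-Dlp-Test', 'value')]  # as set
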
@@ -44,7 +44,7 @@ def websocket_handler(websocket):
             return websocket.send('2')
     elif isinstance(message, str):
         if message == 'headers':
-            return websocket.send(json.dumps(dict(websocket.request.headers)))
+            return websocket.send(json.dumps(dict(websocket.request.headers.raw_items())))
         elif message == 'path':
             return websocket.send(websocket.request.path)
         elif message == 'source_address':
@@ -266,18 +266,18 @@ def test_cookies(self, handler):
         with handler(cookiejar=cookiejar) as rh:
             ws = ws_validate_and_send(rh, Request(self.ws_base_url))
             ws.send('headers')
-            assert json.loads(ws.recv())['cookie'] == 'test=ytdlp'
+            assert HTTPHeaderDict(json.loads(ws.recv()))['cookie'] == 'test=ytdlp'
             ws.close()

         with handler() as rh:
             ws = ws_validate_and_send(rh, Request(self.ws_base_url))
             ws.send('headers')
-            assert 'cookie' not in json.loads(ws.recv())
+            assert 'cookie' not in HTTPHeaderDict(json.loads(ws.recv()))
             ws.close()

             ws = ws_validate_and_send(rh, Request(self.ws_base_url, extensions={'cookiejar': cookiejar}))
             ws.send('headers')
-            assert json.loads(ws.recv())['cookie'] == 'test=ytdlp'
+            assert HTTPHeaderDict(json.loads(ws.recv()))['cookie'] == 'test=ytdlp'
             ws.close()

     @pytest.mark.skip_handler('Websockets', 'Set-Cookie not supported by websockets')
@@ -287,7 +287,7 @@ def test_cookie_sync_only_cookiejar(self, handler):
             ws_validate_and_send(rh, Request(f'{self.ws_base_url}/get_cookie', extensions={'cookiejar': YoutubeDLCookieJar()}))
             ws = ws_validate_and_send(rh, Request(self.ws_base_url, extensions={'cookiejar': YoutubeDLCookieJar()}))
             ws.send('headers')
-            assert 'cookie' not in json.loads(ws.recv())
+            assert 'cookie' not in HTTPHeaderDict(json.loads(ws.recv()))
             ws.close()

     @pytest.mark.skip_handler('Websockets', 'Set-Cookie not supported by websockets')
@@ -298,12 +298,12 @@ def test_cookie_sync_delete_cookie(self, handler):
             ws_validate_and_send(rh, Request(f'{self.ws_base_url}/get_cookie'))
             ws = ws_validate_and_send(rh, Request(self.ws_base_url))
             ws.send('headers')
-            assert json.loads(ws.recv())['cookie'] == 'test=ytdlp'
+            assert HTTPHeaderDict(json.loads(ws.recv()))['cookie'] == 'test=ytdlp'
             ws.close()
             cookiejar.clear_session_cookies()
             ws = ws_validate_and_send(rh, Request(self.ws_base_url))
             ws.send('headers')
-            assert 'cookie' not in json.loads(ws.recv())
+            assert 'cookie' not in HTTPHeaderDict(json.loads(ws.recv()))
             ws.close()

     def test_source_address(self, handler):
@@ -341,6 +341,14 @@ def test_request_headers(self, handler):
             assert headers['test3'] == 'test3'
             ws.close()

+    def test_keep_header_casing(self, handler):
+        with handler(headers=HTTPHeaderDict({'x-TeSt1': 'test'})) as rh:
+            ws = ws_validate_and_send(rh, Request(self.ws_base_url, headers={'x-TeSt2': 'test'}, extensions={'keep_header_casing': True}))
+            ws.send('headers')
+            headers = json.loads(ws.recv())
+
+            assert 'x-TeSt1' in headers
+            assert 'x-TeSt2' in headers
+
     @pytest.mark.parametrize('client_cert', (
         {'client_certificate': os.path.join(MTLS_CERT_DIR, 'clientwithkey.crt')},
         {
@@ -201,6 +201,10 @@
         'https://www.youtube.com/s/player/2f1832d2/player_ias.vflset/en_US/base.js',
         'YWt1qdbe8SAfkoPHW5d', 'RrRjWQOJmBiP',
     ),
+    (
+        'https://www.youtube.com/s/player/9c6dfc4a/player_ias.vflset/en_US/base.js',
+        'jbu7ylIosQHyJyJV', 'uwI0ESiynAmhNg',
+    ),
 ]

@@ -2,4 +2,5 @@


 class PackagePluginIE(InfoExtractor):
+    _VALID_URL = 'package'
     pass

test/testdata/reload_plugins/yt_dlp_plugins/extractor/normal.py (new file, 10 lines)
@@ -0,0 +1,10 @@
+from yt_dlp.extractor.common import InfoExtractor
+
+
+class NormalPluginIE(InfoExtractor):
+    _VALID_URL = 'normal'
+    REPLACED = True
+
+
+class _IgnoreUnderscorePluginIE(InfoExtractor):
+    pass

test/testdata/reload_plugins/yt_dlp_plugins/postprocessor/normal.py (new file, 5 lines)
@@ -0,0 +1,5 @@
+from yt_dlp.postprocessor.common import PostProcessor
+
+
+class NormalPluginPP(PostProcessor):
+    REPLACED = True

@@ -6,6 +6,7 @@ class IgnoreNotInAllPluginIE(InfoExtractor):


 class InAllPluginIE(InfoExtractor):
+    _VALID_URL = 'inallpluginie'
     pass


@@ -2,8 +2,10 @@


 class NormalPluginIE(InfoExtractor):
-    pass
+    _VALID_URL = 'normalpluginie'
+    REPLACED = False


 class _IgnoreUnderscorePluginIE(InfoExtractor):
+    _VALID_URL = 'ignoreunderscorepluginie'
     pass

test/testdata/yt_dlp_plugins/extractor/override.py (new file, 5 lines)
@@ -0,0 +1,5 @@
+from yt_dlp.extractor.generic import GenericIE
+
+
+class OverrideGenericIE(GenericIE, plugin_name='override'):
+    TEST_FIELD = 'override'

test/testdata/yt_dlp_plugins/extractor/overridetwo.py (new file, 5 lines)
@@ -0,0 +1,5 @@
+from yt_dlp.extractor.generic import GenericIE
+
+
+class _UnderscoreOverrideGenericIE(GenericIE, plugin_name='underscore-override'):
+    SECONDARY_TEST_FIELD = 'underscore-override'

@@ -2,4 +2,4 @@


 class NormalPluginPP(PostProcessor):
-    pass
+    REPLACED = False

@@ -2,4 +2,5 @@


 class ZippedPluginIE(InfoExtractor):
+    _VALID_URL = 'zippedpluginie'
     pass
@@ -30,9 +30,18 @@
 from .cookies import CookieLoadError, LenientSimpleCookie, load_cookies
 from .downloader import FFmpegFD, get_suitable_downloader, shorten_protocol_name
 from .downloader.rtmp import rtmpdump_version
-from .extractor import gen_extractor_classes, get_info_extractor
+from .extractor import gen_extractor_classes, get_info_extractor, import_extractors
 from .extractor.common import UnsupportedURLIE
 from .extractor.openload import PhantomJSwrapper
+from .globals import (
+    IN_CLI,
+    LAZY_EXTRACTORS,
+    plugin_ies,
+    plugin_ies_overrides,
+    plugin_pps,
+    all_plugins_loaded,
+    plugin_dirs,
+)
 from .minicurses import format_text
 from .networking import HEADRequest, Request, RequestDirector
 from .networking.common import _REQUEST_HANDLERS, _RH_PREFERENCES
@@ -44,8 +53,7 @@
     network_exceptions,
 )
 from .networking.impersonate import ImpersonateRequestHandler
-from .plugins import directories as plugin_directories
-from .postprocessor import _PLUGIN_CLASSES as plugin_pps
+from .plugins import directories as plugin_directories, load_all_plugins
 from .postprocessor import (
     EmbedThumbnailPP,
     FFmpegFixupDuplicateMoovPP,
@@ -158,7 +166,7 @@
     write_json_file,
     write_string,
 )
-from .utils._utils import _UnsafeExtensionError, _YDLLogger
+from .utils._utils import _UnsafeExtensionError, _YDLLogger, _ProgressState
 from .utils.networking import (
     HTTPHeaderDict,
     clean_headers,
@@ -599,7 +607,7 @@ class YoutubeDL:
         # NB: Keep in sync with the docstring of extractor/common.py
         'url', 'manifest_url', 'manifest_stream_number', 'ext', 'format', 'format_id', 'format_note',
         'width', 'height', 'aspect_ratio', 'resolution', 'dynamic_range', 'tbr', 'abr', 'acodec', 'asr', 'audio_channels',
-        'vbr', 'fps', 'vcodec', 'container', 'filesize', 'filesize_approx', 'rows', 'columns',
+        'vbr', 'fps', 'vcodec', 'container', 'filesize', 'filesize_approx', 'rows', 'columns', 'hls_media_playlist_data',
         'player_url', 'protocol', 'fragment_base_url', 'fragments', 'is_from_start', 'is_dash_periods', 'request_data',
         'preference', 'language', 'language_preference', 'quality', 'source_preference', 'cookies',
         'http_headers', 'stretched_ratio', 'no_resume', 'has_drm', 'extra_param_to_segment_url', 'extra_param_to_key_url',
@ -643,20 +651,23 @@ def __init__(self, params=None, auto_init=True):
|
|||||||
self.cache = Cache(self)
|
self.cache = Cache(self)
|
||||||
self.__header_cookies = []
|
self.__header_cookies = []
|
||||||
|
|
||||||
stdout = sys.stderr if self.params.get('logtostderr') else sys.stdout
|
# compat for API: load plugins if they have not already
|
||||||
self._out_files = Namespace(
|
if not all_plugins_loaded.value:
|
||||||
out=stdout,
|
load_all_plugins()
|
||||||
error=sys.stderr,
|
|
||||||
screen=sys.stderr if self.params.get('quiet') else stdout,
|
|
||||||
console=None if os.name == 'nt' else next(
|
|
||||||
filter(supports_terminal_sequences, (sys.stderr, sys.stdout)), None),
|
|
||||||
)
|
|
||||||
|
|
||||||
try:
|
try:
|
||||||
windows_enable_vt_mode()
|
windows_enable_vt_mode()
|
||||||
except Exception as e:
|
except Exception as e:
|
||||||
self.write_debug(f'Failed to enable VT mode: {e}')
|
self.write_debug(f'Failed to enable VT mode: {e}')
|
||||||
|
|
||||||
|
stdout = sys.stderr if self.params.get('logtostderr') else sys.stdout
|
||||||
|
self._out_files = Namespace(
|
||||||
|
out=stdout,
|
||||||
|
error=sys.stderr,
|
||||||
|
screen=sys.stderr if self.params.get('quiet') else stdout,
|
||||||
|
console=next(filter(supports_terminal_sequences, (sys.stderr, sys.stdout)), None),
|
||||||
|
)
|
||||||
|
|
||||||
if self.params.get('no_color'):
|
if self.params.get('no_color'):
|
||||||
if self.params.get('color') is not None:
|
if self.params.get('color') is not None:
|
||||||
self.params.setdefault('_warnings', []).append(
|
self.params.setdefault('_warnings', []).append(
|
||||||
@@ -957,21 +968,22 @@ def to_stderr(self, message, only_once=False):
         self._write_string(f'{self._bidi_workaround(message)}\n', self._out_files.error, only_once=only_once)

     def _send_console_code(self, code):
-        if os.name == 'nt' or not self._out_files.console:
-            return
+        if not supports_terminal_sequences(self._out_files.console):
+            return False
         self._write_string(code, self._out_files.console)
+        return True

-    def to_console_title(self, message):
-        if not self.params.get('consoletitle', False):
+    def to_console_title(self, message=None, progress_state=None, percent=None):
+        if not self.params.get('consoletitle'):
             return
-        message = remove_terminal_sequences(message)
-        if os.name == 'nt':
-            if ctypes.windll.kernel32.GetConsoleWindow():
-                # c_wchar_p() might not be necessary if `message` is
-                # already of type unicode()
-                ctypes.windll.kernel32.SetConsoleTitleW(ctypes.c_wchar_p(message))
-        else:
-            self._send_console_code(f'\033]0;{message}\007')
+
+        if message:
+            success = self._send_console_code(f'\033]0;{remove_terminal_sequences(message)}\007')
+            if not success and os.name == 'nt' and ctypes.windll.kernel32.GetConsoleWindow():
+                ctypes.windll.kernel32.SetConsoleTitleW(message)
+
+        if isinstance(progress_state, _ProgressState):
+            self._send_console_code(progress_state.get_ansi_escape(percent))

     def save_console_title(self):
         if not self.params.get('consoletitle') or self.params.get('simulate'):
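For context: `\033]0;…\007` is the xterm OSC 0 escape that sets the terminal window title. The new `progress_state.get_ansi_escape(percent)` call presumably emits the ConEmu/Windows Terminal OSC 9;4 taskbar-progress sequence; a minimal sketch under that assumption (the state-to-code mapping is not spelled out in this diff):

import enum

class _ProgressState(enum.Enum):
    # Assumed OSC 9;4 state codes: 0 = hidden, 1 = determinate, 3 = indeterminate
    HIDDEN = 0
    NORMAL = 1
    INDETERMINATE = 3

    def get_ansi_escape(self, percent=None):
        return f'\033]9;4;{self.value};{int(percent or 0)}\007'

print(repr(_ProgressState.NORMAL.get_ansi_escape(42.0)))  # '\x1b]9;4;1;42\x07'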
@@ -985,6 +997,7 @@ def restore_console_title(self):

     def __enter__(self):
         self.save_console_title()
+        self.to_console_title(progress_state=_ProgressState.INDETERMINATE)
         return self

     def save_cookies(self):
@@ -993,6 +1006,7 @@ def save_cookies(self):

     def __exit__(self, *args):
         self.restore_console_title()
+        self.to_console_title(progress_state=_ProgressState.HIDDEN)
         self.close()

     def close(self):
@@ -4012,15 +4026,6 @@ def print_debug_header(self):
         if not self.params.get('verbose'):
             return

-        from . import _IN_CLI  # Must be delayed import
-
-        # These imports can be slow. So import them only as needed
-        from .extractor.extractors import _LAZY_LOADER
-        from .extractor.extractors import (
-            _PLUGIN_CLASSES as plugin_ies,
-            _PLUGIN_OVERRIDES as plugin_ie_overrides,
-        )
-
         def get_encoding(stream):
             ret = str(getattr(stream, 'encoding', f'missing ({type(stream).__name__})'))
             additional_info = []
@@ -4059,17 +4064,18 @@ def get_encoding(stream):
             _make_label(ORIGIN, CHANNEL.partition('@')[2] or __version__, __version__),
             f'[{RELEASE_GIT_HEAD[:9]}]' if RELEASE_GIT_HEAD else '',
             '' if source == 'unknown' else f'({source})',
-            '' if _IN_CLI else 'API' if klass == YoutubeDL else f'API:{self.__module__}.{klass.__qualname__}',
+            '' if IN_CLI.value else 'API' if klass == YoutubeDL else f'API:{self.__module__}.{klass.__qualname__}',
             delim=' '))

-        if not _IN_CLI:
+        if not IN_CLI.value:
             write_debug(f'params: {self.params}')

-        if not _LAZY_LOADER:
-            if os.environ.get('YTDLP_NO_LAZY_EXTRACTORS'):
-                write_debug('Lazy loading extractors is forcibly disabled')
-            else:
-                write_debug('Lazy loading extractors is disabled')
+        import_extractors()
+        lazy_extractors = LAZY_EXTRACTORS.value
+        if lazy_extractors is None:
+            write_debug('Lazy loading extractors is disabled')
+        elif not lazy_extractors:
+            write_debug('Lazy loading extractors is forcibly disabled')

         if self.params['compat_opts']:
             write_debug('Compatibility options: {}'.format(', '.join(self.params['compat_opts'])))
@@ -4098,24 +4104,27 @@ def get_encoding(stream):

         write_debug(f'Proxy map: {self.proxies}')
         write_debug(f'Request Handlers: {", ".join(rh.RH_NAME for rh in self._request_director.handlers.values())}')
-        if os.environ.get('YTDLP_NO_PLUGINS'):
-            write_debug('Plugins are forcibly disabled')
-            return

-        for plugin_type, plugins in {'Extractor': plugin_ies, 'Post-Processor': plugin_pps}.items():
-            display_list = ['{}{}'.format(
-                klass.__name__, '' if klass.__name__ == name else f' as {name}')
-                for name, klass in plugins.items()]
+        for plugin_type, plugins in (('Extractor', plugin_ies), ('Post-Processor', plugin_pps)):
+            display_list = [
+                klass.__name__ if klass.__name__ == name else f'{klass.__name__} as {name}'
+                for name, klass in plugins.value.items()]
             if plugin_type == 'Extractor':
                 display_list.extend(f'{plugins[-1].IE_NAME.partition("+")[2]} ({parent.__name__})'
-                                    for parent, plugins in plugin_ie_overrides.items())
+                                    for parent, plugins in plugin_ies_overrides.value.items())
             if not display_list:
                 continue
             write_debug(f'{plugin_type} Plugins: {", ".join(sorted(display_list))}')

-        plugin_dirs = plugin_directories()
-        if plugin_dirs:
-            write_debug(f'Plugin directories: {plugin_dirs}')
+        plugin_dirs_msg = 'none'
+        if not plugin_dirs.value:
+            plugin_dirs_msg = 'none (disabled)'
+        else:
+            found_plugin_directories = plugin_directories()
+            if found_plugin_directories:
+                plugin_dirs_msg = ', '.join(found_plugin_directories)
+
+        write_debug(f'Plugin directories: {plugin_dirs_msg}')

     @functools.cached_property
     def proxies(self):
@@ -19,7 +19,9 @@
 from .extractor import list_extractor_classes
 from .extractor.adobepass import MSO_INFO
 from .networking.impersonate import ImpersonateTarget
+from .globals import IN_CLI, plugin_dirs
 from .options import parseOpts
+from .plugins import load_all_plugins as _load_all_plugins
 from .postprocessor import (
     FFmpegExtractAudioPP,
     FFmpegMergerPP,
@@ -33,7 +35,6 @@
 )
 from .update import Updater
 from .utils import (
-    Config,
     NO_DEFAULT,
     POSTPROCESS_WHEN,
     DateRange,
@@ -66,8 +67,6 @@
 from .utils._utils import _UnsafeExtensionError
 from .YoutubeDL import YoutubeDL

-_IN_CLI = False
-

 def _exit(status=0, *args):
     for msg in args:
@@ -308,18 +307,20 @@ def parse_sleep_func(expr):
             raise ValueError(f'invalid {key} retry sleep expression {expr!r}')

     # Bytes
-    def validate_bytes(name, value):
+    def validate_bytes(name, value, strict_positive=False):
         if value is None:
             return None
         numeric_limit = parse_bytes(value)
-        validate(numeric_limit is not None, 'rate limit', value)
+        validate(numeric_limit is not None, name, value)
+        if strict_positive:
+            validate_positive(name, numeric_limit, True)
         return numeric_limit

-    opts.ratelimit = validate_bytes('rate limit', opts.ratelimit)
+    opts.ratelimit = validate_bytes('rate limit', opts.ratelimit, True)
     opts.throttledratelimit = validate_bytes('throttled rate limit', opts.throttledratelimit)
     opts.min_filesize = validate_bytes('min filesize', opts.min_filesize)
     opts.max_filesize = validate_bytes('max filesize', opts.max_filesize)
-    opts.buffersize = validate_bytes('buffer size', opts.buffersize)
+    opts.buffersize = validate_bytes('buffer size', opts.buffersize, True)
     opts.http_chunk_size = validate_bytes('http chunk size', opts.http_chunk_size)

     # Output templates
@@ -444,6 +445,10 @@ def metadataparser_actions(f):
     }

     # Other options
+    opts.plugin_dirs = opts.plugin_dirs
+    if opts.plugin_dirs is None:
+        opts.plugin_dirs = ['default']
+
     if opts.playlist_items is not None:
         try:
             tuple(PlaylistEntries.parse_playlist_items(opts.playlist_items))
@@ -984,11 +989,6 @@ def _real_main(argv=None):

     parser, opts, all_urls, ydl_opts = parse_options(argv)

-    # HACK: Set the plugin dirs early on
-    # TODO(coletdjnz): remove when plugin globals system is implemented
-    if opts.plugin_dirs is not None:
-        Config._plugin_dirs = list(map(expand_path, opts.plugin_dirs))
-
     # Dump user agent
     if opts.dump_user_agent:
         ua = traverse_obj(opts.headers, 'User-Agent', casesense=False, default=std_headers['User-Agent'])
@@ -1003,6 +1003,11 @@ def _real_main(argv=None):
     if opts.ffmpeg_location:
         FFmpegPostProcessor._ffmpeg_location.set(opts.ffmpeg_location)

+    # load all plugins into the global lookup
+    plugin_dirs.value = opts.plugin_dirs
+    if plugin_dirs.value:
+        _load_all_plugins()
+
     with YoutubeDL(ydl_opts) as ydl:
         pre_process = opts.update_self or opts.rm_cachedir
         actual_use = all_urls or opts.load_info_filename
@@ -1102,8 +1107,7 @@ def make_row(target, handler):


 def main(argv=None):
-    global _IN_CLI
-    _IN_CLI = True
+    IN_CLI.value = True
     try:
         _exit(*variadic(_real_main(argv)))
     except (CookieLoadError, DownloadError):
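`IN_CLI` and the other names imported from `.globals` replace module-level flags like `_IN_CLI`: state now lives in shared holder objects read and written through `.value`. A minimal sketch of that indirection pattern, assuming a simple holder class (the real `yt_dlp/globals.py` is not part of this diff):

class Indirect:
    def __init__(self, initial):
        self.value = initial

IN_CLI = Indirect(False)

# Every importer shares the same holder object, so mutating `.value` is
# visible across modules without `global` statements or attribute rebinding.
IN_CLI.value = True
assert IN_CLI.value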
@@ -35,6 +35,7 @@ def get_suitable_downloader(info_dict, params={}, default=NO_DEFAULT, protocol=N
 from .rtsp import RtspFD
 from .websocket import WebSocketFragmentFD
 from .youtube_live_chat import YoutubeLiveChatFD
+from .bunnycdn import BunnyCdnFD

 PROTOCOL_MAP = {
     'rtmp': RtmpFD,
@@ -55,6 +56,7 @@ def get_suitable_downloader(info_dict, params={}, default=NO_DEFAULT, protocol=N
     'websocket_frag': WebSocketFragmentFD,
     'youtube_live_chat': YoutubeLiveChatFD,
     'youtube_live_chat_replay': YoutubeLiveChatFD,
+    'bunnycdn': BunnyCdnFD,
 }

yt_dlp/downloader/bunnycdn.py (new file, +50)
@@ -0,0 +1,50 @@
+import hashlib
+import random
+import threading
+
+from .common import FileDownloader
+from . import HlsFD
+from ..networking import Request
+from ..networking.exceptions import network_exceptions
+
+
+class BunnyCdnFD(FileDownloader):
+    """
+    Downloads from BunnyCDN with required pings
+    Note, this is not a part of public API, and will be removed without notice.
+    DO NOT USE
+    """
+
+    def real_download(self, filename, info_dict):
+        self.to_screen(f'[{self.FD_NAME}] Downloading from BunnyCDN')
+
+        fd = HlsFD(self.ydl, self.params)
+
+        stop_event = threading.Event()
+        ping_thread = threading.Thread(target=self.ping_thread, args=(stop_event,), kwargs=info_dict['_bunnycdn_ping_data'])
+        ping_thread.start()
+
+        try:
+            return fd.real_download(filename, info_dict)
+        finally:
+            stop_event.set()
+
+    def ping_thread(self, stop_event, url, headers, secret, context_id):
+        # Site sends ping every 4 seconds, but this throttles the download. Pinging every 2 seconds seems to work.
+        ping_interval = 2
+        # Hard coded resolution as it doesn't seem to matter
+        res = 1080
+        paused = 'false'
+        current_time = 0
+
+        while not stop_event.wait(ping_interval):
+            current_time += ping_interval
+
+            time = current_time + round(random.random(), 6)
+            md5_hash = hashlib.md5(f'{secret}_{context_id}_{time}_{paused}_{res}'.encode()).hexdigest()
+            ping_url = f'{url}?hash={md5_hash}&time={time}&paused={paused}&resolution={res}'
+
+            try:
+                self.ydl.urlopen(Request(ping_url, headers=headers)).read()
+            except network_exceptions as e:
+                self.to_screen(f'[{self.FD_NAME}] Ping failed: {e}')
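`real_download` unpacks `info_dict['_bunnycdn_ping_data']` as keyword arguments for `ping_thread`, so whatever produced the info dict must supply exactly the keys `url`, `headers`, `secret` and `context_id` (the BunnyCDN extractor added later in this commit does). A sketch of the expected hand-off, with illustrative values:

info_dict = {
    # ... usual format fields ...
    '_bunnycdn_ping_data': {
        'url': 'https://iframe.mediadelivery.net/.../ping',  # elided/illustrative
        'headers': {'Referer': 'https://iframe.mediadelivery.net/embed/200867/2e8545ec'},
        'secret': '0123456789abcdef',  # taken from the embed page's src query string
        'context_id': 'abcdef012345',  # likewise
    },
}
# threading.Thread(..., kwargs=info_dict['_bunnycdn_ping_data']) then invokes:
# self.ping_thread(stop_event, url=..., headers=..., secret=..., context_id=...)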
@@ -31,6 +31,7 @@
     timetuple_from_msec,
     try_call,
 )
+from ..utils._utils import _ProgressState


 class FileDownloader:
|
|||||||
progress_dict), s.get('progress_idx') or 0)
|
progress_dict), s.get('progress_idx') or 0)
|
||||||
self.to_console_title(self.ydl.evaluate_outtmpl(
|
self.to_console_title(self.ydl.evaluate_outtmpl(
|
||||||
progress_template.get('download-title') or 'yt-dlp %(progress._default_template)s',
|
progress_template.get('download-title') or 'yt-dlp %(progress._default_template)s',
|
||||||
progress_dict))
|
progress_dict), _ProgressState.from_dict(s), s.get('_percent'))
|
||||||
|
|
||||||
def _format_progress(self, *args, **kwargs):
|
def _format_progress(self, *args, **kwargs):
|
||||||
return self.ydl._format_text(
|
return self.ydl._format_text(
|
||||||
@ -357,6 +358,7 @@ def with_fields(*tups, default=''):
|
|||||||
'_speed_str': self.format_speed(speed).strip(),
|
'_speed_str': self.format_speed(speed).strip(),
|
||||||
'_total_bytes_str': _format_bytes('total_bytes'),
|
'_total_bytes_str': _format_bytes('total_bytes'),
|
||||||
'_elapsed_str': self.format_seconds(s.get('elapsed')),
|
'_elapsed_str': self.format_seconds(s.get('elapsed')),
|
||||||
|
'_percent': 100.0,
|
||||||
'_percent_str': self.format_percent(100),
|
'_percent_str': self.format_percent(100),
|
||||||
})
|
})
|
||||||
self._report_progress_status(s, join_nonempty(
|
self._report_progress_status(s, join_nonempty(
|
||||||
@ -375,13 +377,15 @@ def with_fields(*tups, default=''):
|
|||||||
return
|
return
|
||||||
self._progress_delta_time += update_delta
|
self._progress_delta_time += update_delta
|
||||||
|
|
||||||
|
progress = try_call(
|
||||||
|
lambda: 100 * s['downloaded_bytes'] / s['total_bytes'],
|
||||||
|
lambda: 100 * s['downloaded_bytes'] / s['total_bytes_estimate'],
|
||||||
|
lambda: s['downloaded_bytes'] == 0 and 0)
|
||||||
s.update({
|
s.update({
|
||||||
'_eta_str': self.format_eta(s.get('eta')).strip(),
|
'_eta_str': self.format_eta(s.get('eta')).strip(),
|
||||||
'_speed_str': self.format_speed(s.get('speed')),
|
'_speed_str': self.format_speed(s.get('speed')),
|
||||||
'_percent_str': self.format_percent(try_call(
|
'_percent': progress,
|
||||||
lambda: 100 * s['downloaded_bytes'] / s['total_bytes'],
|
'_percent_str': self.format_percent(progress),
|
||||||
lambda: 100 * s['downloaded_bytes'] / s['total_bytes_estimate'],
|
|
||||||
lambda: s['downloaded_bytes'] == 0 and 0)),
|
|
||||||
'_total_bytes_str': _format_bytes('total_bytes'),
|
'_total_bytes_str': _format_bytes('total_bytes'),
|
||||||
'_total_bytes_estimate_str': _format_bytes('total_bytes_estimate'),
|
'_total_bytes_estimate_str': _format_bytes('total_bytes_estimate'),
|
||||||
'_downloaded_bytes_str': _format_bytes('downloaded_bytes'),
|
'_downloaded_bytes_str': _format_bytes('downloaded_bytes'),
|
||||||
|
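`try_call` returns the result of the first callable that does not raise, so `progress` falls back from the exact total, to the estimated total, to 0 for an empty download, and stays None if nothing applies. A minimal re-statement of that chain (the real helper also catches a wider set of exceptions and can filter by expected type):

s = {'downloaded_bytes': 512, 'total_bytes_estimate': 2048}  # no 'total_bytes'

progress = None
for candidate in (
    lambda: 100 * s['downloaded_bytes'] / s['total_bytes'],           # KeyError -> next
    lambda: 100 * s['downloaded_bytes'] / s['total_bytes_estimate'],  # 25.0 -> done
    lambda: s['downloaded_bytes'] == 0 and 0,
):
    try:
        progress = candidate()
        break
    except (KeyError, ZeroDivisionError):
        continue

assert progress == 25.0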
@@ -457,8 +457,6 @@ class FFmpegFD(ExternalFD):

     @classmethod
     def available(cls, path=None):
-        # TODO: Fix path for ffmpeg
-        # Fixme: This may be wrong when --ffmpeg-location is used
         return FFmpegPostProcessor().available

     def on_process_started(self, proc, stdin):
@@ -16,6 +16,7 @@
     update_url_query,
     urljoin,
 )
+from ..utils._utils import _request_dump_filename


 class HlsFD(FragmentFD):
@@ -72,11 +73,23 @@ def check_results():

     def real_download(self, filename, info_dict):
         man_url = info_dict['url']
-        self.to_screen(f'[{self.FD_NAME}] Downloading m3u8 manifest')

-        urlh = self.ydl.urlopen(self._prepare_url(info_dict, man_url))
-        man_url = urlh.url
-        s = urlh.read().decode('utf-8', 'ignore')
+        s = info_dict.get('hls_media_playlist_data')
+        if s:
+            self.to_screen(f'[{self.FD_NAME}] Using m3u8 manifest from extracted info')
+        else:
+            self.to_screen(f'[{self.FD_NAME}] Downloading m3u8 manifest')
+            urlh = self.ydl.urlopen(self._prepare_url(info_dict, man_url))
+            man_url = urlh.url
+            s_bytes = urlh.read()
+            if self.params.get('write_pages'):
+                dump_filename = _request_dump_filename(
+                    man_url, info_dict['id'], None,
+                    trim_length=self.params.get('trim_file_name'))
+                self.to_screen(f'[{self.FD_NAME}] Saving request to {dump_filename}')
+                with open(dump_filename, 'wb') as outf:
+                    outf.write(s_bytes)
+            s = s_bytes.decode('utf-8', 'ignore')

         can_download, message = self.can_download(s, info_dict, self.params.get('allow_unplayable_formats')), None
         if can_download:
@@ -177,6 +190,7 @@ def is_ad_fragment_end(s):
         if external_aes_iv:
             external_aes_iv = binascii.unhexlify(remove_start(external_aes_iv, '0x').zfill(32))
         byte_range = {}
+        byte_range_offset = 0
         discontinuity_count = 0
         frag_index = 0
         ad_frag_next = False
@@ -204,6 +218,11 @@ def is_ad_fragment_end(s):
                     })
                     media_sequence += 1

+                    # If the byte_range is truthy, reset it after appending a fragment that uses it
+                    if byte_range:
+                        byte_range_offset = byte_range['end']
+                        byte_range = {}
+
                 elif line.startswith('#EXT-X-MAP'):
                     if format_index and discontinuity_count != format_index:
                         continue
@@ -217,10 +236,12 @@ def is_ad_fragment_end(s):
                     if extra_segment_query:
                         frag_url = update_url_query(frag_url, extra_segment_query)

+                    map_byte_range = {}
+
                     if map_info.get('BYTERANGE'):
                         splitted_byte_range = map_info.get('BYTERANGE').split('@')
-                        sub_range_start = int(splitted_byte_range[1]) if len(splitted_byte_range) == 2 else byte_range['end']
-                        byte_range = {
+                        sub_range_start = int(splitted_byte_range[1]) if len(splitted_byte_range) == 2 else 0
+                        map_byte_range = {
                             'start': sub_range_start,
                             'end': sub_range_start + int(splitted_byte_range[0]),
                         }
@@ -229,7 +250,7 @@ def is_ad_fragment_end(s):
                         'frag_index': frag_index,
                         'url': frag_url,
                         'decrypt_info': decrypt_info,
-                        'byte_range': byte_range,
+                        'byte_range': map_byte_range,
                         'media_sequence': media_sequence,
                     })
                     media_sequence += 1
@@ -257,7 +278,7 @@ def is_ad_fragment_end(s):
                     media_sequence = int(line[22:])
                 elif line.startswith('#EXT-X-BYTERANGE'):
                     splitted_byte_range = line[17:].split('@')
-                    sub_range_start = int(splitted_byte_range[1]) if len(splitted_byte_range) == 2 else byte_range['end']
+                    sub_range_start = int(splitted_byte_range[1]) if len(splitted_byte_range) == 2 else byte_range_offset
                     byte_range = {
                         'start': sub_range_start,
                         'end': sub_range_start + int(splitted_byte_range[0]),
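Background for the byte-range changes: `#EXT-X-BYTERANGE:<n>[@<o>]` selects n bytes of the segment resource and, when the offset o is omitted, starts where the previous sub-range of the same resource ended (RFC 8216 §4.3.2.2). Tracking `byte_range_offset` separately keeps that chaining correct even after an `#EXT-X-MAP` tag, which previously clobbered `byte_range`. A standalone sketch of the rule:

def parse_byteranges(tags):
    # tags: values of successive #EXT-X-BYTERANGE lines, e.g. '1000@0', '1000'
    offset = 0
    for tag in tags:
        length, _, start = tag.partition('@')
        start = int(start) if start else offset  # omitted offset -> previous end
        end = start + int(length)
        yield {'start': start, 'end': end}
        offset = end

assert list(parse_byteranges(['1000@0', '1000'])) == [
    {'start': 0, 'end': 1000}, {'start': 1000, 'end': 2000}]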
@@ -1,16 +1,25 @@
 from ..compat.compat_utils import passthrough_module
+from ..globals import extractors as _extractors_context
+from ..globals import plugin_ies as _plugin_ies_context
+from ..plugins import PluginSpec, register_plugin_spec

 passthrough_module(__name__, '.extractors')
 del passthrough_module

+register_plugin_spec(PluginSpec(
+    module_name='extractor',
+    suffix='IE',
+    destination=_extractors_context,
+    plugin_destination=_plugin_ies_context,
+))
+

 def gen_extractor_classes():
     """ Return a list of supported extractors.
     The order does matter; the first extractor matched is the one handling the URL.
     """
-    from .extractors import _ALL_CLASSES
-
-    return _ALL_CLASSES
+    import_extractors()
+    return list(_extractors_context.value.values())


 def gen_extractors():
@@ -37,6 +46,9 @@ def list_extractors(age_limit=None):

 def get_info_extractor(ie_name):
     """Returns the info extractor class with the given ie_name"""
-    from . import extractors
-
-    return getattr(extractors, f'{ie_name}IE')
+    import_extractors()
+    return _extractors_context.value[f'{ie_name}IE']
+
+
+def import_extractors():
+    from . import extractors  # noqa: F401
@@ -312,6 +312,7 @@
 )
 from .bundesliga import BundesligaIE
 from .bundestag import BundestagIE
+from .bunnycdn import BunnyCdnIE
 from .businessinsider import BusinessInsiderIE
 from .buzzfeed import BuzzFeedIE
 from .byutv import BYUtvIE
@@ -508,6 +509,7 @@
 from .dhm import DHMIE
 from .digitalconcerthall import DigitalConcertHallIE
 from .digiteka import DigitekaIE
+from .digiview import DigiviewIE
 from .discogs import DiscogsReleasePlaylistIE
 from .disney import DisneyIE
 from .dispeak import DigitallySpeakingIE
@@ -1893,6 +1895,7 @@
 from .smotrim import SmotrimIE
 from .snapchat import SnapchatSpotlightIE
 from .snotr import SnotrIE
+from .softwhiteunderbelly import SoftWhiteUnderbellyIE
 from .sohu import (
     SohuIE,
     SohuVIE,
@@ -2221,6 +2224,7 @@
     TVPlayIE,
 )
 from .tvplayer import TVPlayerIE
+from .tvw import TvwIE
 from .tweakers import TweakersIE
 from .twentymin import TwentyMinutenIE
 from .twentythreevideo import TwentyThreeVideoIE
|
|||||||
import json
|
|
||||||
|
|
||||||
from .common import InfoExtractor
|
from .common import InfoExtractor
|
||||||
from .kaltura import KalturaIE
|
from .kaltura import KalturaIE
|
||||||
|
from ..utils.traversal import require, traverse_obj
|
||||||
|
|
||||||
|
|
||||||
class AZMedienIE(InfoExtractor):
|
class AZMedienIE(InfoExtractor):
|
||||||
@ -9,15 +8,15 @@ class AZMedienIE(InfoExtractor):
|
|||||||
_VALID_URL = r'''(?x)
|
_VALID_URL = r'''(?x)
|
||||||
https?://
|
https?://
|
||||||
(?:www\.|tv\.)?
|
(?:www\.|tv\.)?
|
||||||
(?P<host>
|
(?:
|
||||||
telezueri\.ch|
|
telezueri\.ch|
|
||||||
telebaern\.tv|
|
telebaern\.tv|
|
||||||
telem1\.ch|
|
telem1\.ch|
|
||||||
tvo-online\.ch
|
tvo-online\.ch
|
||||||
)/
|
)/
|
||||||
[^/]+/
|
[^/?#]+/
|
||||||
(?P<id>
|
(?P<id>
|
||||||
[^/]+-(?P<article_id>\d+)
|
[^/?#]+-\d+
|
||||||
)
|
)
|
||||||
(?:
|
(?:
|
||||||
\#video=
|
\#video=
|
||||||
@ -47,19 +46,17 @@ class AZMedienIE(InfoExtractor):
|
|||||||
'url': 'https://www.telebaern.tv/telebaern-news/montag-1-oktober-2018-ganze-sendung-133531189#video=0_7xjo9lf1',
|
'url': 'https://www.telebaern.tv/telebaern-news/montag-1-oktober-2018-ganze-sendung-133531189#video=0_7xjo9lf1',
|
||||||
'only_matching': True,
|
'only_matching': True,
|
||||||
}]
|
}]
|
||||||
_API_TEMPL = 'https://www.%s/api/pub/gql/%s/NewsArticleTeaser/a4016f65fe62b81dc6664dd9f4910e4ab40383be'
|
|
||||||
_PARTNER_ID = '1719221'
|
_PARTNER_ID = '1719221'
|
||||||
|
|
||||||
def _real_extract(self, url):
|
def _real_extract(self, url):
|
||||||
host, display_id, article_id, entry_id = self._match_valid_url(url).groups()
|
display_id, entry_id = self._match_valid_url(url).groups()
|
||||||
|
|
||||||
if not entry_id:
|
if not entry_id:
|
||||||
entry_id = self._download_json(
|
webpage = self._download_webpage(url, display_id)
|
||||||
self._API_TEMPL % (host, host.split('.')[0]), display_id, query={
|
data = self._search_json(
|
||||||
'variables': json.dumps({
|
r'window\.__APOLLO_STATE__\s*=', webpage, 'video data', display_id)
|
||||||
'contextId': 'NewsArticle:' + article_id,
|
entry_id = traverse_obj(data, (
|
||||||
}),
|
lambda _, v: v['__typename'] == 'KalturaData', 'kalturaId', any, {require('kaltura id')}))
|
||||||
})['data']['context']['mainAsset']['video']['kaltura']['kalturaId']
|
|
||||||
|
|
||||||
return self.url_result(
|
return self.url_result(
|
||||||
f'kaltura:{self._PARTNER_ID}:{entry_id}',
|
f'kaltura:{self._PARTNER_ID}:{entry_id}',
|
||||||
|
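The rewritten AZMedien extractor reads the Apollo GraphQL cache embedded in the page instead of calling the removed private API. A hedged sketch of the blob shape the new traversal expects; only `__typename` and `kalturaId` are actually relied upon, the surrounding keys are made up:

data = {  # parsed from: window.__APOLLO_STATE__ = {...}
    'KalturaData:xyz': {'__typename': 'KalturaData', 'kalturaId': '1_abcdef12'},
    'NewsArticle:123456': {'__typename': 'NewsArticle'},
}
entry_id = next(
    v['kalturaId'] for v in data.values()
    if isinstance(v, dict) and v.get('__typename') == 'KalturaData')
assert entry_id == '1_abcdef12'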
yt_dlp/extractor/bunnycdn.py (new file, +178)
@@ -0,0 +1,178 @@
+import json
+
+from .common import InfoExtractor
+from ..networking import HEADRequest
+from ..utils import (
+    ExtractorError,
+    extract_attributes,
+    int_or_none,
+    parse_qs,
+    smuggle_url,
+    unsmuggle_url,
+    url_or_none,
+    urlhandle_detect_ext,
+)
+from ..utils.traversal import find_element, traverse_obj
+
+
+class BunnyCdnIE(InfoExtractor):
+    _VALID_URL = r'https?://(?:iframe\.mediadelivery\.net|video\.bunnycdn\.com)/(?:embed|play)/(?P<library_id>\d+)/(?P<id>[\da-f-]+)'
+    _EMBED_REGEX = [rf'<iframe[^>]+src=[\'"](?P<url>{_VALID_URL}[^\'"]*)[\'"]']
+    _TESTS = [{
+        'url': 'https://iframe.mediadelivery.net/embed/113933/e73edec1-e381-4c8b-ae73-717a140e0924',
+        'info_dict': {
+            'id': 'e73edec1-e381-4c8b-ae73-717a140e0924',
+            'ext': 'mp4',
+            'title': 'mistress morgana (3).mp4',
+            'description': '',
+            'timestamp': 1693251673,
+            'thumbnail': r're:^https?://.*\.b-cdn\.net/e73edec1-e381-4c8b-ae73-717a140e0924/thumbnail\.jpg',
+            'duration': 7.0,
+            'upload_date': '20230828',
+        },
+        'params': {'skip_download': True},
+    }, {
+        'url': 'https://iframe.mediadelivery.net/play/136145/32e34c4b-0d72-437c-9abb-05e67657da34',
+        'info_dict': {
+            'id': '32e34c4b-0d72-437c-9abb-05e67657da34',
+            'ext': 'mp4',
+            'timestamp': 1691145748,
+            'thumbnail': r're:^https?://.*\.b-cdn\.net/32e34c4b-0d72-437c-9abb-05e67657da34/thumbnail_9172dc16\.jpg',
+            'duration': 106.0,
+            'description': 'md5:981a3e899a5c78352b21ed8b2f1efd81',
+            'upload_date': '20230804',
+            'title': 'Sanela ist Teil der #arbeitsmarktkraft',
+        },
+        'params': {'skip_download': True},
+    }, {
+        # Stream requires activation and pings
+        'url': 'https://iframe.mediadelivery.net/embed/200867/2e8545ec-509d-4571-b855-4cf0235ccd75',
+        'info_dict': {
+            'id': '2e8545ec-509d-4571-b855-4cf0235ccd75',
+            'ext': 'mp4',
+            'timestamp': 1708497752,
+            'title': 'netflix part 1',
+            'duration': 3959.0,
+            'description': '',
+            'upload_date': '20240221',
+            'thumbnail': r're:^https?://.*\.b-cdn\.net/2e8545ec-509d-4571-b855-4cf0235ccd75/thumbnail\.jpg',
+        },
+        'params': {'skip_download': True},
+    }]
+    _WEBPAGE_TESTS = [{
+        # Stream requires Referer
+        'url': 'https://conword.io/',
+        'info_dict': {
+            'id': '3a5d863e-9cd6-447e-b6ef-e289af50b349',
+            'ext': 'mp4',
+            'title': 'Conword bei der Stadt Köln und Stadt Dortmund',
+            'description': '',
+            'upload_date': '20231031',
+            'duration': 31.0,
+            'thumbnail': 'https://video.watchuh.com/3a5d863e-9cd6-447e-b6ef-e289af50b349/thumbnail.jpg',
+            'timestamp': 1698783879,
+        },
+        'params': {'skip_download': True},
+    }, {
+        # URL requires token and expires
+        'url': 'https://www.stockphotos.com/video/moscow-subway-the-train-is-arriving-at-the-park-kultury-station-10017830',
+        'info_dict': {
+            'id': '0b02fa20-4e8c-4140-8f87-f64d820a3386',
+            'ext': 'mp4',
+            'thumbnail': r're:^https?://.*\.b-cdn\.net/0b02fa20-4e8c-4140-8f87-f64d820a3386/thumbnail\.jpg',
+            'title': 'Moscow subway. The train is arriving at the Park Kultury station.',
+            'upload_date': '20240531',
+            'duration': 18.0,
+            'timestamp': 1717152269,
+            'description': '',
+        },
+        'params': {'skip_download': True},
+    }]
+
+    @classmethod
+    def _extract_embed_urls(cls, url, webpage):
+        for embed_url in super()._extract_embed_urls(url, webpage):
+            yield smuggle_url(embed_url, {'Referer': url})
+
+    def _real_extract(self, url):
+        url, smuggled_data = unsmuggle_url(url, {})
+
+        video_id, library_id = self._match_valid_url(url).group('id', 'library_id')
+        webpage = self._download_webpage(
+            f'https://iframe.mediadelivery.net/embed/{library_id}/{video_id}', video_id,
+            headers=traverse_obj(smuggled_data, {'Referer': 'Referer'}),
+            query=traverse_obj(parse_qs(url), {'token': 'token', 'expires': 'expires'}))
+
+        if (html_title := self._html_extract_title(webpage, default=None)) == '403':
+            raise ExtractorError(
+                'This video is inaccessible. Setting a Referer header '
+                'might be required to access the video', expected=True)
+        elif html_title == '404':
+            raise ExtractorError('This video does not exist', expected=True)
+
+        headers = {'Referer': url}
+
+        info = traverse_obj(self._parse_html5_media_entries(url, webpage, video_id, _headers=headers), 0) or {}
+        formats = info.get('formats') or []
+        subtitles = info.get('subtitles') or {}
+
+        original_url = self._search_regex(
+            r'(?:var|const|let)\s+originalUrl\s*=\s*["\']([^"\']+)["\']', webpage, 'original url', default=None)
+        if url_or_none(original_url):
+            urlh = self._request_webpage(
+                HEADRequest(original_url), video_id=video_id, note='Checking original',
+                headers=headers, fatal=False, expected_status=(403, 404))
+            if urlh and urlh.status == 200:
+                formats.append({
+                    'url': original_url,
+                    'format_id': 'source',
+                    'quality': 1,
+                    'http_headers': headers,
+                    'ext': urlhandle_detect_ext(urlh, default='mp4'),
+                    'filesize': int_or_none(urlh.get_header('Content-Length')),
+                })
+
+        # MediaCage Streams require activation and pings
+        src_url = self._search_regex(
+            r'\.setAttribute\([\'"]src[\'"],\s*[\'"]([^\'"]+)[\'"]\)', webpage, 'src url', default=None)
+        activation_url = self._search_regex(
+            r'loadUrl\([\'"]([^\'"]+/activate)[\'"]', webpage, 'activation url', default=None)
+        ping_url = self._search_regex(
+            r'loadUrl\([\'"]([^\'"]+/ping)[\'"]', webpage, 'ping url', default=None)
+        secret = traverse_obj(parse_qs(src_url), ('secret', 0))
+        context_id = traverse_obj(parse_qs(src_url), ('contextId', 0))
+        ping_data = {}
+        if src_url and activation_url and ping_url and secret and context_id:
+            self._download_webpage(
+                activation_url, video_id, headers=headers, note='Downloading activation data')
+
+            fmts, subs = self._extract_m3u8_formats_and_subtitles(
+                src_url, video_id, 'mp4', headers=headers, m3u8_id='hls', fatal=False)
+            for fmt in fmts:
+                fmt.update({
+                    'protocol': 'bunnycdn',
+                    'http_headers': headers,
+                })
+            formats.extend(fmts)
+            self._merge_subtitles(subs, target=subtitles)
+
+            ping_data = {
+                '_bunnycdn_ping_data': {
+                    'url': ping_url,
+                    'headers': headers,
+                    'secret': secret,
+                    'context_id': context_id,
+                },
+            }
+
+        return {
+            'id': video_id,
+            'formats': formats,
+            'subtitles': subtitles,
+            **traverse_obj(webpage, ({find_element(id='main-video', html=True)}, {extract_attributes}, {
+                'title': ('data-plyr-config', {json.loads}, 'title', {str}),
+                'thumbnail': ('data-poster', {url_or_none}),
+            })),
+            **ping_data,
+            **self._search_json_ld(webpage, video_id, fatal=False),
+        }
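`_extract_embed_urls` smuggles the embedding page's URL into the iframe URL so that `_real_extract` can replay it as a Referer after unsmuggling. A minimal round trip with yt-dlp's helpers:

from yt_dlp.utils import smuggle_url, unsmuggle_url

embed = 'https://iframe.mediadelivery.net/embed/113933/e73edec1-e381-4c8b-ae73-717a140e0924'
carried = smuggle_url(embed, {'Referer': 'https://conword.io/'})

url, smuggled_data = unsmuggle_url(carried, {})
assert url == embed
assert smuggled_data == {'Referer': 'https://conword.io/'}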
@@ -1,29 +1,32 @@
-import base64
 import functools
-import json
 import re
 import time
 import urllib.parse

 from .common import InfoExtractor
 from ..networking import HEADRequest
+from ..networking.exceptions import HTTPError
 from ..utils import (
     ExtractorError,
     float_or_none,
     int_or_none,
     js_to_json,
+    jwt_decode_hs256,
     mimetype2ext,
     orderedSet,
+    parse_age_limit,
     parse_iso8601,
     replace_extension,
     smuggle_url,
     strip_or_none,
-    traverse_obj,
     try_get,
+    unified_timestamp,
     update_url,
     url_basename,
     url_or_none,
+    urlencode_postdata,
 )
+from ..utils.traversal import require, traverse_obj, trim_str


 class CBCIE(InfoExtractor):
@@ -516,9 +519,43 @@ def entries():
         return self.playlist_result(entries(), playlist_id)


-class CBCGemIE(InfoExtractor):
+class CBCGemBaseIE(InfoExtractor):
+    _NETRC_MACHINE = 'cbcgem'
+    _GEO_COUNTRIES = ['CA']
+
+    def _call_show_api(self, item_id, display_id=None):
+        return self._download_json(
+            f'https://services.radio-canada.ca/ott/catalog/v2/gem/show/{item_id}',
+            display_id or item_id, query={'device': 'web'})
+
+    def _extract_item_info(self, item_info):
+        episode_number = None
+        title = traverse_obj(item_info, ('title', {str}))
+        if title and (mobj := re.match(r'(?P<episode>\d+)\. (?P<title>.+)', title)):
+            episode_number = int_or_none(mobj.group('episode'))
+            title = mobj.group('title')
+
+        return {
+            'episode_number': episode_number,
+            **traverse_obj(item_info, {
+                'id': ('url', {str}),
+                'episode_id': ('url', {str}),
+                'description': ('description', {str}),
+                'thumbnail': ('images', 'card', 'url', {url_or_none}, {update_url(query=None)}),
+                'episode_number': ('episodeNumber', {int_or_none}),
+                'duration': ('metadata', 'duration', {int_or_none}),
+                'release_timestamp': ('metadata', 'airDate', {unified_timestamp}),
+                'timestamp': ('metadata', 'availabilityDate', {unified_timestamp}),
+                'age_limit': ('metadata', 'rating', {trim_str(start='C')}, {parse_age_limit}),
+            }),
+            'episode': title,
+            'title': title,
+        }
+
+
+class CBCGemIE(CBCGemBaseIE):
     IE_NAME = 'gem.cbc.ca'
-    _VALID_URL = r'https?://gem\.cbc\.ca/(?:media/)?(?P<id>[0-9a-z-]+/s[0-9]+[a-z][0-9]+)'
+    _VALID_URL = r'https?://gem\.cbc\.ca/(?:media/)?(?P<id>[0-9a-z-]+/s(?P<season>[0-9]+)[a-z][0-9]+)'
     _TESTS = [{
         # This is a normal, public, TV show video
         'url': 'https://gem.cbc.ca/media/schitts-creek/s06e01',
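`_extract_item_info` strips a leading episode number from API titles of the form "<number>. <title>". A quick illustration (the sample title mirrors the test data below):

import re

title = '1. The Cup Runneth Over'
if mobj := re.match(r'(?P<episode>\d+)\. (?P<title>.+)', title):
    assert int(mobj.group('episode')) == 1
    assert mobj.group('title') == 'The Cup Runneth Over'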
@@ -529,7 +566,7 @@ class CBCGemIE(InfoExtractor):
             'description': 'md5:929868d20021c924020641769eb3e7f1',
             'thumbnail': r're:https://images\.radio-canada\.ca/[^#?]+/cbc_schitts_creek_season_06e01_thumbnail_v01\.jpg',
             'duration': 1324,
-            'categories': ['comedy'],
+            'genres': ['Comédie et humour'],
             'series': 'Schitt\'s Creek',
             'season': 'Season 6',
             'season_number': 6,
@@ -537,9 +574,10 @@ class CBCGemIE(InfoExtractor):
             'episode_number': 1,
             'episode_id': 'schitts-creek/s06e01',
             'upload_date': '20210618',
-            'timestamp': 1623988800,
+            'timestamp': 1623974400,
             'release_date': '20200107',
-            'release_timestamp': 1578427200,
+            'release_timestamp': 1578355200,
+            'age_limit': 14,
         },
         'params': {'format': 'bv'},
     }, {
@@ -557,12 +595,13 @@ class CBCGemIE(InfoExtractor):
             'episode_number': 1,
             'episode': 'The Cup Runneth Over',
             'episode_id': 'schitts-creek/s01e01',
-            'duration': 1309,
-            'categories': ['comedy'],
+            'duration': 1308,
+            'genres': ['Comédie et humour'],
             'upload_date': '20210617',
-            'timestamp': 1623902400,
-            'release_date': '20151124',
-            'release_timestamp': 1448323200,
+            'timestamp': 1623888000,
+            'release_date': '20151123',
+            'release_timestamp': 1448236800,
+            'age_limit': 14,
         },
         'params': {'format': 'bv'},
     }, {
@@ -570,82 +609,107 @@ class CBCGemIE(InfoExtractor):
         'only_matching': True,
     }]

-    _GEO_COUNTRIES = ['CA']
-    _TOKEN_API_KEY = '3f4beddd-2061-49b0-ae80-6f1f2ed65b37'
-    _NETRC_MACHINE = 'cbcgem'
+    _CLIENT_ID = 'fc05b0ee-3865-4400-a3cc-3da82c330c23'
+    _refresh_token = None
+    _access_token = None
     _claims_token = None

-    def _new_claims_token(self, email, password):
-        data = json.dumps({
-            'email': email,
-            'password': password,
-        }).encode()
-        headers = {'content-type': 'application/json'}
-        query = {'apikey': self._TOKEN_API_KEY}
-        resp = self._download_json('https://api.loginradius.com/identity/v2/auth/login',
-                                   None, data=data, headers=headers, query=query)
-        access_token = resp['access_token']
-
-        query = {
-            'access_token': access_token,
-            'apikey': self._TOKEN_API_KEY,
-            'jwtapp': 'jwt',
-        }
-        resp = self._download_json('https://cloud-api.loginradius.com/sso/jwt/api/token',
-                                   None, headers=headers, query=query)
-        sig = resp['signature']
-
-        data = json.dumps({'jwt': sig}).encode()
-        headers = {'content-type': 'application/json', 'ott-device-type': 'web'}
-        resp = self._download_json('https://services.radio-canada.ca/ott/cbc-api/v2/token',
-                                   None, data=data, headers=headers, expected_status=426)
-        cbc_access_token = resp['accessToken']
-
-        headers = {'content-type': 'application/json', 'ott-device-type': 'web', 'ott-access-token': cbc_access_token}
-        resp = self._download_json('https://services.radio-canada.ca/ott/cbc-api/v2/profile',
-                                   None, headers=headers, expected_status=426)
-        return resp['claimsToken']
-
-    def _get_claims_token_expiry(self):
-        # Token is a JWT
-        # JWT is decoded here and 'exp' field is extracted
-        # It is a Unix timestamp for when the token expires
-        b64_data = self._claims_token.split('.')[1]
-        data = base64.urlsafe_b64decode(b64_data + '==')
-        return json.loads(data)['exp']
-
-    def claims_token_expired(self):
-        exp = self._get_claims_token_expiry()
-        # It will expire in less than 10 seconds, or has already expired
-        return exp - time.time() < 10
-
-    def claims_token_valid(self):
-        return self._claims_token is not None and not self.claims_token_expired()
-
-    def _get_claims_token(self, email, password):
-        if not self.claims_token_valid():
-            self._claims_token = self._new_claims_token(email, password)
-            self.cache.store(self._NETRC_MACHINE, 'claims_token', self._claims_token)
-        return self._claims_token
-
-    def _real_initialize(self):
-        if self.claims_token_valid():
-            return
-        self._claims_token = self.cache.load(self._NETRC_MACHINE, 'claims_token')
+    @functools.cached_property
+    def _ropc_settings(self):
+        return self._download_json(
+            'https://services.radio-canada.ca/ott/catalog/v1/gem/settings', None,
+            'Downloading site settings', query={'device': 'web'})['identityManagement']['ropc']
+
+    def _is_jwt_expired(self, token):
+        return jwt_decode_hs256(token)['exp'] - time.time() < 300
+
+    def _call_oauth_api(self, oauth_data, note='Refreshing access token'):
+        response = self._download_json(
+            self._ropc_settings['url'], None, note, data=urlencode_postdata({
+                'client_id': self._CLIENT_ID,
+                **oauth_data,
+                'scope': self._ropc_settings['scopes'],
+            }))
+        self._refresh_token = response['refresh_token']
+        self._access_token = response['access_token']
+        self.cache.store(self._NETRC_MACHINE, 'token_data', [self._refresh_token, self._access_token])
+
+    def _perform_login(self, username, password):
+        if not self._refresh_token:
+            self._refresh_token, self._access_token = self.cache.load(
+                self._NETRC_MACHINE, 'token_data', default=[None, None])
+
+        if self._refresh_token and self._access_token:
+            self.write_debug('Using cached refresh token')
+            if not self._claims_token:
+                self._claims_token = self.cache.load(self._NETRC_MACHINE, 'claims_token')
+            return
+
+        try:
+            self._call_oauth_api({
+                'grant_type': 'password',
+                'username': username,
+                'password': password,
+            }, note='Logging in')
+        except ExtractorError as e:
+            if isinstance(e.cause, HTTPError) and e.cause.status == 400:
+                raise ExtractorError('Invalid username and/or password', expected=True)
+            raise
+
+    def _fetch_access_token(self):
+        if self._is_jwt_expired(self._access_token):
+            try:
+                self._call_oauth_api({
+                    'grant_type': 'refresh_token',
+                    'refresh_token': self._refresh_token,
+                })
+            except ExtractorError:
+                self._refresh_token, self._access_token = None, None
+                self.cache.store(self._NETRC_MACHINE, 'token_data', [None, None])
+                self.report_warning('Refresh token has been invalidated; retrying with credentials')
+                self._perform_login(*self._get_login_info())
+
+        return self._access_token
+
+    def _fetch_claims_token(self):
+        if not self._get_login_info()[0]:
+            return None
+
+        if not self._claims_token or self._is_jwt_expired(self._claims_token):
+            self._claims_token = self._download_json(
+                'https://services.radio-canada.ca/ott/subscription/v2/gem/Subscriber/profile',
+                None, 'Downloading claims token', query={'device': 'web'},
+                headers={'Authorization': f'Bearer {self._fetch_access_token()}'})['claimsToken']
+            self.cache.store(self._NETRC_MACHINE, 'claims_token', self._claims_token)
+        else:
+            self.write_debug('Using cached claims token')
+
+        return self._claims_token

     def _real_extract(self, url):
-        video_id = self._match_id(url)
-        video_info = self._download_json(
-            f'https://services.radio-canada.ca/ott/cbc-api/v2/assets/{video_id}',
-            video_id, expected_status=426)
-
-        email, password = self._get_login_info()
-        if email and password:
-            claims_token = self._get_claims_token(email, password)
-            headers = {'x-claims-token': claims_token}
-        else:
-            headers = {}
-        m3u8_info = self._download_json(video_info['playSession']['url'], video_id, headers=headers)
+        video_id, season_number = self._match_valid_url(url).group('id', 'season')
+        video_info = self._call_show_api(video_id)
+        item_info = traverse_obj(video_info, (
+            'content', ..., 'lineups', ..., 'items',
+            lambda _, v: v['url'] == video_id, any, {require('item info')}))
+
+        headers = {}
+        if claims_token := self._fetch_claims_token():
+            headers['x-claims-token'] = claims_token
+
+        m3u8_info = self._download_json(
+            'https://services.radio-canada.ca/media/validation/v2/',
+            video_id, headers=headers, query={
+                'appCode': 'gem',
+                'connectionType': 'hd',
+                'deviceType': 'ipad',
+                'multibitrate': 'true',
+                'output': 'json',
+                'tech': 'hls',
+                'manifestVersion': '2',
+                'manifestType': 'desktop',
+                'idMedia': item_info['idMedia'],
+            })

         if m3u8_info.get('errorCode') == 1:
             self.raise_geo_restricted(countries=['CA'])
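`_is_jwt_expired` treats a token as stale five minutes before its `exp` claim; `jwt_decode_hs256` only base64-decodes the payload and does not verify the signature. A self-contained sketch of the same check (the token here is a throwaway with dummy header and signature segments):

import base64
import json
import time

def decode_jwt_payload(token):
    _, payload, _ = token.split('.')
    return json.loads(base64.urlsafe_b64decode(payload + '=' * (-len(payload) % 4)))

def is_jwt_expired(token, leeway=300):
    return decode_jwt_payload(token)['exp'] - time.time() < leeway

payload = base64.urlsafe_b64encode(
    json.dumps({'exp': int(time.time()) + 600}).encode()).decode().rstrip('=')
assert not is_jwt_expired(f'e30.{payload}.sig')  # 'e30' is base64 for '{}'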
@ -671,26 +735,20 @@ def _real_extract(self, url):
|
|||||||
fmt['preference'] = -2
|
fmt['preference'] = -2
|
||||||
|
|
||||||
return {
|
return {
|
||||||
|
'season_number': int_or_none(season_number),
|
||||||
|
**traverse_obj(video_info, {
|
||||||
|
'series': ('title', {str}),
|
||||||
|
'season_number': ('structuredMetadata', 'partofSeason', 'seasonNumber', {int_or_none}),
|
||||||
|
'genres': ('structuredMetadata', 'genre', ..., {str}),
|
||||||
|
}),
|
||||||
|
**self._extract_item_info(item_info),
|
||||||
'id': video_id,
|
'id': video_id,
|
||||||
'episode_id': video_id,
|
'episode_id': video_id,
|
||||||
'formats': formats,
|
'formats': formats,
|
||||||
**traverse_obj(video_info, {
|
|
||||||
'title': ('title', {str}),
|
|
||||||
'episode': ('title', {str}),
|
|
||||||
'description': ('description', {str}),
|
|
||||||
'thumbnail': ('image', {url_or_none}),
|
|
||||||
'series': ('series', {str}),
|
|
||||||
'season_number': ('season', {int_or_none}),
|
|
||||||
'episode_number': ('episode', {int_or_none}),
|
|
||||||
'duration': ('duration', {int_or_none}),
|
|
||||||
'categories': ('category', {str}, all),
|
|
||||||
'release_timestamp': ('airDate', {int_or_none(scale=1000)}),
|
|
||||||
'timestamp': ('availableDate', {int_or_none(scale=1000)}),
|
|
||||||
}),
|
|
||||||
}
|
}
|
||||||
|
|
||||||
|
|
||||||
class CBCGemPlaylistIE(InfoExtractor):
|
class CBCGemPlaylistIE(CBCGemBaseIE):
|
||||||
IE_NAME = 'gem.cbc.ca:playlist'
|
IE_NAME = 'gem.cbc.ca:playlist'
|
     _VALID_URL = r'https?://gem\.cbc\.ca/(?:media/)?(?P<id>(?P<show>[0-9a-z-]+)/s(?P<season>[0-9]+))/?(?:[?#]|$)'
     _TESTS = [{
@@ -700,70 +758,35 @@ class CBCGemPlaylistIE(InfoExtractor):
         'info_dict': {
             'id': 'schitts-creek/s06',
             'title': 'Season 6',
-            'description': 'md5:6a92104a56cbeb5818cc47884d4326a2',
             'series': 'Schitt\'s Creek',
             'season_number': 6,
             'season': 'Season 6',
-            'thumbnail': 'https://images.radio-canada.ca/v1/synps-cbc/season/perso/cbc_schitts_creek_season_06_carousel_v03.jpg?impolicy=ott&im=Resize=(_Size_)&quality=75',
         },
     }, {
         'url': 'https://gem.cbc.ca/schitts-creek/s06',
         'only_matching': True,
     }]
-    _API_BASE = 'https://services.radio-canada.ca/ott/cbc-api/v2/shows/'
+
+    def _entries(self, season_info):
+        for episode in traverse_obj(season_info, ('items', lambda _, v: v['url'])):
+            yield self.url_result(
+                f'https://gem.cbc.ca/media/{episode["url"]}', CBCGemIE,
+                **self._extract_item_info(episode))
 
     def _real_extract(self, url):
-        match = self._match_valid_url(url)
-        season_id = match.group('id')
-        show = match.group('show')
-        show_info = self._download_json(self._API_BASE + show, season_id, expected_status=426)
-        season = int(match.group('season'))
-
-        season_info = next((s for s in show_info['seasons'] if s.get('season') == season), None)
-        if season_info is None:
-            raise ExtractorError(f'Couldn\'t find season {season} of {show}')
-
-        episodes = []
-        for episode in season_info['assets']:
-            episodes.append({
-                '_type': 'url_transparent',
-                'ie_key': 'CBCGem',
-                'url': 'https://gem.cbc.ca/media/' + episode['id'],
-                'id': episode['id'],
-                'title': episode.get('title'),
-                'description': episode.get('description'),
-                'thumbnail': episode.get('image'),
-                'series': episode.get('series'),
-                'season_number': episode.get('season'),
-                'season': season_info['title'],
-                'season_id': season_info.get('id'),
-                'episode_number': episode.get('episode'),
-                'episode': episode.get('title'),
-                'episode_id': episode['id'],
-                'duration': episode.get('duration'),
-                'categories': [episode.get('category')],
-            })
-
-        thumbnail = None
-        tn_uri = season_info.get('image')
-        # the-national was observed to use a "data:image/png;base64"
-        # URI for their 'image' value. The image was 1x1, and is
-        # probably just a placeholder, so it is ignored.
-        if tn_uri is not None and not tn_uri.startswith('data:'):
-            thumbnail = tn_uri
-
-        return {
-            '_type': 'playlist',
-            'entries': episodes,
-            'id': season_id,
-            'title': season_info['title'],
-            'description': season_info.get('description'),
-            'thumbnail': thumbnail,
-            'series': show_info.get('title'),
-            'season_number': season_info.get('season'),
-            'season': season_info['title'],
-        }
+        season_id, show, season = self._match_valid_url(url).group('id', 'show', 'season')
+        show_info = self._call_show_api(show, display_id=season_id)
+        season_info = traverse_obj(show_info, (
+            'content', ..., 'lineups',
+            lambda _, v: v['seasonNumber'] == int(season), any, {require('season info')}))
+
+        return self.playlist_result(
+            self._entries(season_info), season_id,
+            **traverse_obj(season_info, {
+                'title': ('title', {str}),
+                'season': ('title', {str}),
+                'season_number': ('seasonNumber', {int_or_none}),
+            }), series=traverse_obj(show_info, ('title', {str})))
 
 
 class CBCGemLiveIE(InfoExtractor):
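The rewritten lookup leans entirely on yt-dlp's traversal helpers: the lambda filters lineups by seasonNumber, `any` collapses the matches to the first hit, and `require(...)` turns a missing season into an ExtractorError instead of a silent None. A hedged standalone sketch of that chain (the sample data is invented for illustration):

from yt_dlp.utils.traversal import require, traverse_obj

show_info = {'content': [{'lineups': [{'seasonNumber': 6, 'title': 'Season 6'}]}]}
season_info = traverse_obj(show_info, (
    'content', ..., 'lineups',
    lambda _, v: v['seasonNumber'] == 6, any, {require('season info')}))
assert season_info['title'] == 'Season 6'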
yt_dlp/extractor/common.py
@@ -2,7 +2,6 @@
 import collections
 import functools
 import getpass
-import hashlib
 import http.client
 import http.cookiejar
 import http.cookies
@@ -30,6 +29,7 @@
 from ..cookies import LenientSimpleCookie
 from ..downloader.f4m import get_base_url, remove_encrypted_media
 from ..downloader.hls import HlsFD
+from ..globals import plugin_ies_overrides
 from ..networking import HEADRequest, Request
 from ..networking.exceptions import (
     HTTPError,
@@ -78,7 +78,6 @@
     parse_iso8601,
     parse_m3u8_attributes,
     parse_resolution,
-    sanitize_filename,
     sanitize_url,
     smuggle_url,
     str_or_none,
@@ -100,6 +99,7 @@
     xpath_text,
     xpath_with_ns,
 )
+from ..utils._utils import _request_dump_filename
 
 
 class InfoExtractor:
@@ -201,6 +201,11 @@ class InfoExtractor:
                                  fragment_base_url
                     * "duration" (optional, int or float)
                     * "filesize" (optional, int)
+                    * hls_media_playlist_data
+                                 The M3U8 media playlist data as a string.
+                                 Only use if the data must be modified during extraction and
+                                 the native HLS downloader should bypass requesting the URL.
+                                 Does not apply if ffmpeg is used as external downloader
                     * is_from_start  Is a live format that can be downloaded
                                  from the start. Boolean
                     * preference     Order number of this format. If this field is
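The new field lets an extractor hand an already-downloaded (and possibly edited) media playlist to the native HLS downloader. A minimal sketch of how an extractor might use it, assuming `modified_m3u8` holds the playlist text that was altered during extraction:

# Inside a hypothetical _real_extract (a sketch, not code from this diff):
formats = self._extract_m3u8_formats(m3u8_url, video_id, 'mp4')
for f in formats:
    # The native HLS downloader will use this string directly instead of
    # re-requesting the playlist URL (and losing the modifications)
    f['hls_media_playlist_data'] = modified_m3u8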
@@ -1017,23 +1022,6 @@ def __check_blocked(self, content):
             'Visit http://blocklist.rkn.gov.ru/ for a block reason.',
             expected=True)
 
-    def _request_dump_filename(self, url, video_id, data=None):
-        if data is not None:
-            data = hashlib.md5(data).hexdigest()
-        basen = join_nonempty(video_id, data, url, delim='_')
-        trim_length = self.get_param('trim_file_name') or 240
-        if len(basen) > trim_length:
-            h = '___' + hashlib.md5(basen.encode()).hexdigest()
-            basen = basen[:trim_length - len(h)] + h
-        filename = sanitize_filename(f'{basen}.dump', restricted=True)
-        # Working around MAX_PATH limitation on Windows (see
-        # http://msdn.microsoft.com/en-us/library/windows/desktop/aa365247(v=vs.85).aspx)
-        if os.name == 'nt':
-            absfilepath = os.path.abspath(filename)
-            if len(absfilepath) > 259:
-                filename = fR'\\?\{absfilepath}'
-        return filename
-
     def __decode_webpage(self, webpage_bytes, encoding, headers):
         if not encoding:
             encoding = self._guess_encoding_from_content(headers.get('Content-Type', ''), webpage_bytes)
@@ -1062,7 +1050,9 @@ def _webpage_read_content(self, urlh, url_or_request, video_id, note=None, errno
         if self.get_param('write_pages'):
             if isinstance(url_or_request, Request):
                 data = self._create_request(url_or_request, data).data
-            filename = self._request_dump_filename(urlh.url, video_id, data)
+            filename = _request_dump_filename(
+                urlh.url, video_id, data,
+                trim_length=self.get_param('trim_file_name'))
             self.to_screen(f'Saving request to {filename}')
             with open(filename, 'wb') as outf:
                 outf.write(webpage_bytes)
@@ -1123,7 +1113,9 @@ def download_content(self, url_or_request, video_id, note=note, errnote=errnote,
                          impersonate=None, require_impersonation=False):
             if self.get_param('load_pages'):
                 url_or_request = self._create_request(url_or_request, data, headers, query)
-                filename = self._request_dump_filename(url_or_request.url, video_id, url_or_request.data)
+                filename = _request_dump_filename(
+                    url_or_request.url, video_id, url_or_request.data,
+                    trim_length=self.get_param('trim_file_name'))
                 self.to_screen(f'Loading request from {filename}')
                 try:
                     with open(filename, 'rb') as dumpf:
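The dump-filename helper has moved from a method on InfoExtractor to a module-level function, with the trim length now passed in explicitly rather than read from params internally. Since --write-pages and --load-pages both resolve filenames through the same call, a dump written by one run can be replayed by another; a sketch of that round-trip, assuming the positional signature visible in the call sites above:

from yt_dlp.utils._utils import _request_dump_filename

# Both code paths must produce the same name for replay to work;
# trim_length=None lets the helper fall back to its internal default
name = _request_dump_filename('https://example.com/api', 'video123', b'payload', trim_length=None)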
@@ -3963,14 +3955,18 @@ def _extract_url(cls, webpage):  # TODO: Remove
     def __init_subclass__(cls, *, plugin_name=None, **kwargs):
         if plugin_name:
             mro = inspect.getmro(cls)
-            super_class = cls.__wrapped__ = mro[mro.index(cls) + 1]
-            cls.PLUGIN_NAME, cls.ie_key = plugin_name, super_class.ie_key
-            cls.IE_NAME = f'{super_class.IE_NAME}+{plugin_name}'
+            next_mro_class = super_class = mro[mro.index(cls) + 1]
             while getattr(super_class, '__wrapped__', None):
                 super_class = super_class.__wrapped__
-            setattr(sys.modules[super_class.__module__], super_class.__name__, cls)
-            _PLUGIN_OVERRIDES[super_class].append(cls)
+
+            if not any(override.PLUGIN_NAME == plugin_name for override in plugin_ies_overrides.value[super_class]):
+                cls.__wrapped__ = next_mro_class
+                cls.PLUGIN_NAME, cls.ie_key = plugin_name, next_mro_class.ie_key
+                cls.IE_NAME = f'{next_mro_class.IE_NAME}+{plugin_name}'
+
+                setattr(sys.modules[super_class.__module__], super_class.__name__, cls)
+                plugin_ies_overrides.value[super_class].append(cls)
         return super().__init_subclass__(**kwargs)
 
 
@@ -4026,6 +4022,3 @@ class UnsupportedURLIE(InfoExtractor):
 
     def _real_extract(self, url):
         raise UnsupportedError(url)
-
-
-_PLUGIN_OVERRIDES = collections.defaultdict(list)
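The important behavioural change here is the new guard: registering a plugin override a second time (for example when modules are re-imported) is now a no-op instead of stacking another wrapper onto the base extractor. The same pattern in isolation, with illustrative names rather than the real yt-dlp globals:

import collections

overrides = collections.defaultdict(list)  # stand-in for plugin_ies_overrides.value

def register(base, override_cls, plugin_name):
    # Skip re-registration when a plugin with this name already wraps `base`
    if any(o.PLUGIN_NAME == plugin_name for o in overrides[base]):
        return
    override_cls.PLUGIN_NAME = plugin_name
    overrides[base].append(override_cls)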
yt_dlp/extractor/cultureunplugged.py
@@ -3,7 +3,7 @@
 
 
 class CultureUnpluggedIE(InfoExtractor):
-    _VALID_URL = r'https?://(?:www\.)?cultureunplugged\.com/documentary/watch-online/play/(?P<id>\d+)(?:/(?P<display_id>[^/]+))?'
+    _VALID_URL = r'https?://(?:www\.)?cultureunplugged\.com/(?:documentary/watch-online/)?play/(?P<id>\d+)(?:/(?P<display_id>[^/#?]+))?'
     _TESTS = [{
         'url': 'http://www.cultureunplugged.com/documentary/watch-online/play/53662/The-Next--Best-West',
         'md5': 'ac6c093b089f7d05e79934dcb3d228fc',
@@ -12,12 +12,25 @@ class CultureUnpluggedIE(InfoExtractor):
             'display_id': 'The-Next--Best-West',
             'ext': 'mp4',
             'title': 'The Next, Best West',
-            'description': 'md5:0423cd00833dea1519cf014e9d0903b1',
+            'description': 'md5:770033a3b7c2946a3bcfb7f1c6fb7045',
             'thumbnail': r're:^https?://.*\.jpg$',
-            'creator': 'Coldstream Creative',
+            'creators': ['Coldstream Creative'],
             'duration': 2203,
             'view_count': int,
         },
+    }, {
+        'url': 'https://www.cultureunplugged.com/play/2833/Koi-Sunta-Hai--Journeys-with-Kumar---Kabir--Someone-is-Listening-',
+        'md5': 'dc2014bc470dfccba389a1c934fa29fa',
+        'info_dict': {
+            'id': '2833',
+            'display_id': 'Koi-Sunta-Hai--Journeys-with-Kumar---Kabir--Someone-is-Listening-',
+            'ext': 'mp4',
+            'title': 'Koi Sunta Hai: Journeys with Kumar & Kabir (Someone is Listening)',
+            'description': 'md5:fa94ac934927c98660362b8285b2cda5',
+            'view_count': int,
+            'thumbnail': 'https://s3.amazonaws.com/cdn.cultureunplugged.com/thumbnails_16_9/lg/2833.jpg',
+            'creators': ['Srishti'],
+        },
     }, {
         'url': 'http://www.cultureunplugged.com/documentary/watch-online/play/53662',
         'only_matching': True,
yt_dlp/extractor/dailymotion.py
@@ -100,7 +100,7 @@ def _call_api(self, object_type, xid, object_fields, note, filter_extra=None):
 
 class DailymotionIE(DailymotionBaseInfoExtractor):
     _VALID_URL = r'''(?ix)
-                https?://
+                (?:https?:)?//
                     (?:
                         dai\.ly/|
                         (?:
@@ -116,7 +116,7 @@ class DailymotionIE(DailymotionBaseInfoExtractor):
                     (?P<id>[^/?_&#]+)(?:[\w-]*\?playlist=(?P<playlist_id>x[0-9a-z]+))?
     '''
     IE_NAME = 'dailymotion'
-    _EMBED_REGEX = [r'<(?:(?:embed|iframe)[^>]+?src=|input[^>]+id=[\'"]dmcloudUrlEmissionSelect[\'"][^>]+value=)(["\'])(?P<url>(?:https?:)?//(?:www\.)?dailymotion\.com/(?:embed|swf)/video/.+?)\1']
+    _EMBED_REGEX = [rf'(?ix)<(?:(?:embed|iframe)[^>]+?src=|input[^>]+id=[\'"]dmcloudUrlEmissionSelect[\'"][^>]+value=)["\'](?P<url>{_VALID_URL[5:]})']
     _TESTS = [{
         'url': 'http://www.dailymotion.com/video/x5kesuj_office-christmas-party-review-jason-bateman-olivia-munn-t-j-miller_news',
         'md5': '074b95bdee76b9e3654137aee9c79dfe',
@@ -308,6 +308,25 @@ class DailymotionIE(DailymotionBaseInfoExtractor):
             'description': 'Que lindura',
             'tags': [],
         },
+    }, {
+        # //geo.dailymotion.com/player/xysxq.html?video=k2Y4Mjp7krAF9iCuINM
+        'url': 'https://lcp.fr/programmes/avant-la-catastrophe-la-naissance-de-la-dictature-nazie-1933-1936-346819',
+        'info_dict': {
+            'id': 'k2Y4Mjp7krAF9iCuINM',
+            'ext': 'mp4',
+            'title': 'Avant la catastrophe la naissance de la dictature nazie 1933 -1936',
+            'description': 'md5:7b620d5e26edbe45f27bbddc1c0257c1',
+            'uploader': 'LCP Assemblée nationale',
+            'uploader_id': 'xbz33d',
+            'view_count': int,
+            'like_count': int,
+            'age_limit': 0,
+            'duration': 3220,
+            'thumbnail': 'https://s1.dmcdn.net/v/Xvumk1djJBUZfjj2a/x1080',
+            'tags': [],
+            'timestamp': 1739919947,
+            'upload_date': '20250218',
+        },
     }]
     _GEO_BYPASS = False
     _COMMON_MEDIA_FIELDS = '''description
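Two related tweaks above: _VALID_URL now accepts scheme-relative URLs through the optional (?:https?:)? group, and _EMBED_REGEX is rebuilt from _VALID_URL[5:] (slicing off the leading "(?ix)" flags, which the f-string re-adds at the front), so embed detection can no longer drift out of sync with direct matching. The scheme-relative idea in isolation, with an illustrative pattern rather than the full _VALID_URL:

import re

# Embeds like //geo.dailymotion.com/... omit the "https:" scheme;
# the optional non-capturing group accepts both forms
pattern = re.compile(r'(?:https?:)?//geo\.dailymotion\.com/player/[\w.]+\.html\?video=(?P<id>\w+)')
assert pattern.match('//geo.dailymotion.com/player/xysxq.html?video=k2Y4Mjp7krAF9iCuINM')
assert pattern.match('https://geo.dailymotion.com/player/xysxq.html?video=k2Y4Mjp7krAF9iCuINM')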
|
130
yt_dlp/extractor/digiview.py
Normal file
130
yt_dlp/extractor/digiview.py
Normal file
@ -0,0 +1,130 @@
|
|||||||
|
from .common import InfoExtractor
|
||||||
|
from .youtube import YoutubeIE
|
||||||
|
from ..utils import clean_html, int_or_none, traverse_obj, url_or_none, urlencode_postdata
|
||||||
|
|
||||||
|
|
||||||
|
class DigiviewIE(InfoExtractor):
|
||||||
|
_VALID_URL = r'https?://(?:www\.)?ladigitale\.dev/digiview/#/v/(?P<id>[0-9a-f]+)'
|
||||||
|
_TESTS = [{
|
||||||
|
# normal video
|
||||||
|
'url': 'https://ladigitale.dev/digiview/#/v/67a8e50aee2ec',
|
||||||
|
'info_dict': {
|
||||||
|
'id': '67a8e50aee2ec',
|
||||||
|
'ext': 'mp4',
|
||||||
|
'title': 'Big Buck Bunny 60fps 4K - Official Blender Foundation Short Film',
|
||||||
|
'thumbnail': 'https://i.ytimg.com/vi/aqz-KE-bpKQ/hqdefault.jpg',
|
||||||
|
'upload_date': '20141110',
|
||||||
|
'playable_in_embed': True,
|
||||||
|
'duration': 635,
|
||||||
|
'view_count': int,
|
||||||
|
'comment_count': int,
|
||||||
|
'channel': 'Blender',
|
||||||
|
'license': 'Creative Commons Attribution license (reuse allowed)',
|
||||||
|
'like_count': int,
|
||||||
|
'tags': 'count:8',
|
||||||
|
'live_status': 'not_live',
|
||||||
|
'channel_id': 'UCSMOQeBJ2RAnuFungnQOxLg',
|
||||||
|
'channel_follower_count': int,
|
||||||
|
'channel_url': 'https://www.youtube.com/channel/UCSMOQeBJ2RAnuFungnQOxLg',
|
||||||
|
'uploader_id': '@BlenderOfficial',
|
||||||
|
'description': 'md5:8f3ed18a53a1bb36cbb3b70a15782fd0',
|
||||||
|
'categories': ['Film & Animation'],
|
||||||
|
'channel_is_verified': True,
|
||||||
|
'heatmap': 'count:100',
|
||||||
|
'section_end': 635,
|
||||||
|
'uploader': 'Blender',
|
||||||
|
'timestamp': 1415628355,
|
||||||
|
'uploader_url': 'https://www.youtube.com/@BlenderOfficial',
|
||||||
|
'age_limit': 0,
|
||||||
|
'section_start': 0,
|
||||||
|
'availability': 'public',
|
||||||
|
},
|
||||||
|
}, {
|
||||||
|
# cut video
|
||||||
|
'url': 'https://ladigitale.dev/digiview/#/v/67a8e51d0dd58',
|
||||||
|
'info_dict': {
|
||||||
|
'id': '67a8e51d0dd58',
|
||||||
|
'ext': 'mp4',
|
||||||
|
'title': 'Big Buck Bunny 60fps 4K - Official Blender Foundation Short Film',
|
||||||
|
'thumbnail': 'https://i.ytimg.com/vi/aqz-KE-bpKQ/hqdefault.jpg',
|
||||||
|
'upload_date': '20141110',
|
||||||
|
'playable_in_embed': True,
|
||||||
|
'duration': 5,
|
||||||
|
'view_count': int,
|
||||||
|
'comment_count': int,
|
||||||
|
'channel': 'Blender',
|
||||||
|
'license': 'Creative Commons Attribution license (reuse allowed)',
|
||||||
|
'like_count': int,
|
||||||
|
'tags': 'count:8',
|
||||||
|
'live_status': 'not_live',
|
||||||
|
'channel_id': 'UCSMOQeBJ2RAnuFungnQOxLg',
|
||||||
|
'channel_follower_count': int,
|
||||||
|
'channel_url': 'https://www.youtube.com/channel/UCSMOQeBJ2RAnuFungnQOxLg',
|
||||||
|
'uploader_id': '@BlenderOfficial',
|
||||||
|
'description': 'md5:8f3ed18a53a1bb36cbb3b70a15782fd0',
|
||||||
|
'categories': ['Film & Animation'],
|
||||||
|
'channel_is_verified': True,
|
||||||
|
'heatmap': 'count:100',
|
||||||
|
'section_end': 10,
|
||||||
|
'uploader': 'Blender',
|
||||||
|
'timestamp': 1415628355,
|
||||||
|
'uploader_url': 'https://www.youtube.com/@BlenderOfficial',
|
||||||
|
'age_limit': 0,
|
||||||
|
'section_start': 5,
|
||||||
|
'availability': 'public',
|
||||||
|
},
|
||||||
|
}, {
|
||||||
|
# changed title
|
||||||
|
'url': 'https://ladigitale.dev/digiview/#/v/67a8ea5644d7a',
|
||||||
|
'info_dict': {
|
||||||
|
'id': '67a8ea5644d7a',
|
||||||
|
'ext': 'mp4',
|
||||||
|
'title': 'Big Buck Bunny (with title changed)',
|
||||||
|
'thumbnail': 'https://i.ytimg.com/vi/aqz-KE-bpKQ/hqdefault.jpg',
|
||||||
|
'upload_date': '20141110',
|
||||||
|
'playable_in_embed': True,
|
||||||
|
'duration': 5,
|
||||||
|
'view_count': int,
|
||||||
|
'comment_count': int,
|
||||||
|
'channel': 'Blender',
|
||||||
|
'license': 'Creative Commons Attribution license (reuse allowed)',
|
||||||
|
'like_count': int,
|
||||||
|
'tags': 'count:8',
|
||||||
|
'live_status': 'not_live',
|
||||||
|
'channel_id': 'UCSMOQeBJ2RAnuFungnQOxLg',
|
||||||
|
'channel_follower_count': int,
|
||||||
|
'channel_url': 'https://www.youtube.com/channel/UCSMOQeBJ2RAnuFungnQOxLg',
|
||||||
|
'uploader_id': '@BlenderOfficial',
|
||||||
|
'description': 'md5:8f3ed18a53a1bb36cbb3b70a15782fd0',
|
||||||
|
'categories': ['Film & Animation'],
|
||||||
|
'channel_is_verified': True,
|
||||||
|
'heatmap': 'count:100',
|
||||||
|
'section_end': 15,
|
||||||
|
'uploader': 'Blender',
|
||||||
|
'timestamp': 1415628355,
|
||||||
|
'uploader_url': 'https://www.youtube.com/@BlenderOfficial',
|
||||||
|
'age_limit': 0,
|
||||||
|
'section_start': 10,
|
||||||
|
'availability': 'public',
|
||||||
|
},
|
||||||
|
}]
|
||||||
|
|
||||||
|
def _real_extract(self, url):
|
||||||
|
video_id = self._match_id(url)
|
||||||
|
video_data = self._download_json(
|
||||||
|
'https://ladigitale.dev/digiview/inc/recuperer_video.php', video_id,
|
||||||
|
data=urlencode_postdata({'id': video_id}))
|
||||||
|
|
||||||
|
clip_id = video_data['videoId']
|
||||||
|
return self.url_result(
|
||||||
|
f'https://www.youtube.com/watch?v={clip_id}',
|
||||||
|
YoutubeIE, video_id, url_transparent=True,
|
||||||
|
**traverse_obj(video_data, {
|
||||||
|
'section_start': ('debut', {int_or_none}),
|
||||||
|
'section_end': ('fin', {int_or_none}),
|
||||||
|
'description': ('description', {clean_html}, filter),
|
||||||
|
'title': ('titre', {str}),
|
||||||
|
'thumbnail': ('vignette', {url_or_none}),
|
||||||
|
'view_count': ('vues', {int_or_none}),
|
||||||
|
}),
|
||||||
|
)
|
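The new extractor never touches the media itself: it resolves the Digiview ID to a YouTube clip ID, then returns a url_transparent result so the YouTube extractor does the heavy lifting while Digiview's French API fields ('titre' title, 'debut'/'fin' cut bounds, 'vignette' thumbnail, 'vues' views) override the corresponding YouTube metadata. The merge semantics, roughly (a simplification of what YoutubeDL actually does with url_transparent entries):

def resolve_url_transparent(outer, extract_inner):
    # Fields set on the outer (url_transparent) result win over the
    # delegated extractor's values; everything else passes through
    inner = extract_inner(outer['url'])
    return {**inner, **{k: v for k, v in outer.items() if v is not None and k != 'url'}}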
yt_dlp/extractor/dreisat.py
@@ -1,10 +1,24 @@
-from .zdf import ZDFIE
+from .zdf import ZDFBaseIE
 
 
-class DreiSatIE(ZDFIE):  # XXX: Do not subclass from concrete IE
+class DreiSatIE(ZDFBaseIE):
     IE_NAME = '3sat'
     _VALID_URL = r'https?://(?:www\.)?3sat\.de/(?:[^/]+/)*(?P<id>[^/?#&]+)\.html'
     _TESTS = [{
+        'url': 'https://www.3sat.de/dokumentation/reise/traumziele-suedostasiens-die-philippinen-und-vietnam-102.html',
+        'info_dict': {
+            'id': '231124_traumziele_philippinen_und_vietnam_dokreise',
+            'ext': 'mp4',
+            'title': 'Traumziele Südostasiens (1/2): Die Philippinen und Vietnam',
+            'description': 'md5:26329ce5197775b596773b939354079d',
+            'duration': 2625.0,
+            'thumbnail': 'https://www.3sat.de/assets/traumziele-suedostasiens-die-philippinen-und-vietnam-100~2400x1350?cb=1699870351148',
+            'episode': 'Traumziele Südostasiens (1/2): Die Philippinen und Vietnam',
+            'episode_id': 'POS_cc7ff51c-98cf-4d12-b99d-f7a551de1c95',
+            'timestamp': 1738593000,
+            'upload_date': '20250203',
+        },
+    }, {
         # Same as https://www.zdf.de/dokumentation/ab-18/10-wochen-sommer-102.html
         'url': 'https://www.3sat.de/film/ab-18/10-wochen-sommer-108.html',
         'md5': '0aff3e7bc72c8813f5e0fae333316a1d',
@@ -17,6 +31,7 @@ class DreiSatIE(ZDFIE):  # XXX: Do not subclass from concrete IE
             'timestamp': 1608604200,
             'upload_date': '20201222',
         },
+        'skip': '410 Gone',
     }, {
         'url': 'https://www.3sat.de/gesellschaft/schweizweit/waidmannsheil-100.html',
         'info_dict': {
@@ -30,6 +45,7 @@ class DreiSatIE(ZDFIE):  # XXX: Do not subclass from concrete IE
         'params': {
             'skip_download': True,
         },
+        'skip': '404 Not Found',
     }, {
         # Same as https://www.zdf.de/filme/filme-sonstige/der-hauptmann-112.html
         'url': 'https://www.3sat.de/film/spielfilm/der-hauptmann-100.html',
@@ -39,3 +55,14 @@ class DreiSatIE(ZDFIE):  # XXX: Do not subclass from concrete IE
         'url': 'https://www.3sat.de/wissen/nano/nano-21-mai-2019-102.html',
         'only_matching': True,
     }]
+
+    def _real_extract(self, url):
+        video_id = self._match_id(url)
+
+        webpage = self._download_webpage(url, video_id, fatal=False)
+        if webpage:
+            player = self._extract_player(webpage, url, fatal=False)
+            if player:
+                return self._extract_regular(url, player, video_id)
+
+        return self._extract_mobile(video_id)
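The new _real_extract tries the desktop page first but treats every step as optional (fatal=False), falling back to ZDF's mobile API when either the page or its player config is unavailable. The control flow, reduced to a shape-only sketch (the callables are placeholders for the ZDFBaseIE helpers):

def extract(url, video_id, download_page, extract_player, extract_regular, extract_mobile):
    # Each stage may fail soft; only the mobile API is the hard requirement
    webpage = download_page(url, video_id)  # returns None on failure
    if webpage:
        player = extract_player(webpage, url)  # returns None on failure
        if player:
            return extract_regular(url, player, video_id)
    return extract_mobile(video_id)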
yt_dlp/extractor/extractors.py
@@ -1,28 +1,37 @@
-import contextlib
+import inspect
 import os
 
-from ..plugins import load_plugins
+from ..globals import LAZY_EXTRACTORS
+from ..globals import extractors as _extractors_context
 
-# NB: Must be before other imports so that plugins can be correctly injected
-_PLUGIN_CLASSES = load_plugins('extractor', 'IE')
-
-_LAZY_LOADER = False
-if not os.environ.get('YTDLP_NO_LAZY_EXTRACTORS'):
-    with contextlib.suppress(ImportError):
-        from .lazy_extractors import *  # noqa: F403
-        from .lazy_extractors import _ALL_CLASSES
-        _LAZY_LOADER = True
-
-if not _LAZY_LOADER:
-    from ._extractors import *  # noqa: F403
-    _ALL_CLASSES = [  # noqa: F811
-        klass
-        for name, klass in globals().items()
+_CLASS_LOOKUP = None
+if os.environ.get('YTDLP_NO_LAZY_EXTRACTORS'):
+    LAZY_EXTRACTORS.value = False
+else:
+    try:
+        from .lazy_extractors import _CLASS_LOOKUP
+        LAZY_EXTRACTORS.value = True
+    except ImportError:
+        LAZY_EXTRACTORS.value = None
+
+if not _CLASS_LOOKUP:
+    from . import _extractors
+
+    _CLASS_LOOKUP = {
+        name: value
+        for name, value in inspect.getmembers(_extractors)
         if name.endswith('IE') and name != 'GenericIE'
-    ]
-    _ALL_CLASSES.append(GenericIE)  # noqa: F405
+    }
+    _CLASS_LOOKUP['GenericIE'] = _extractors.GenericIE
 
-globals().update(_PLUGIN_CLASSES)
-_ALL_CLASSES[:0] = _PLUGIN_CLASSES.values()
+# We want to append to the main lookup
+_current = _extractors_context.value
+for name, ie in _CLASS_LOOKUP.items():
+    _current.setdefault(name, ie)
 
-from .common import _PLUGIN_OVERRIDES  # noqa: F401
+
+def __getattr__(name):
+    value = _CLASS_LOOKUP.get(name)
+    if not value:
+        raise AttributeError(f'module {__name__} has no attribute {name}')
+    return value
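The star-imports are gone: instead of dumping every extractor class into the module's globals, the module now serves them on demand through a module-level __getattr__ (PEP 562, Python 3.7+), which fires whenever an attribute lookup misses the module's real globals. A standalone sketch of the mechanism, with illustrative names:

# lazy_module.py — module attribute hook per PEP 562
_REGISTRY = {'FooIE': object}  # stand-in for the real _CLASS_LOOKUP

def __getattr__(name):
    try:
        return _REGISTRY[name]
    except KeyError:
        raise AttributeError(f'module {__name__!r} has no attribute {name!r}') from None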
yt_dlp/extractor/francetv.py
@@ -9,6 +9,7 @@
     ExtractorError,
     clean_html,
     determine_ext,
+    extract_attributes,
     filter_dict,
     format_field,
     int_or_none,
@@ -18,7 +19,7 @@
     unsmuggle_url,
     url_or_none,
 )
-from ..utils.traversal import traverse_obj
+from ..utils.traversal import find_element, traverse_obj
 
 
 class FranceTVBaseInfoExtractor(InfoExtractor):
@@ -358,7 +359,8 @@ def _real_extract(self, url):
             # For livestreams we need the id of the stream instead of the currently airing episode id
             video_id = traverse_obj(nextjs_data, (
                 ..., ..., 'children', ..., 'children', ..., 'children', ..., 'children', ..., ...,
-                'children', ..., ..., 'children', ..., ..., 'children', ..., 'options', 'id', {str}, any))
+                'children', ..., ..., 'children', ..., ..., 'children', (..., (..., ...)),
+                'options', 'id', {str}, any))
         else:
             video_id = traverse_obj(nextjs_data, (
                 ..., ..., ..., 'children',
@@ -459,11 +461,16 @@ def _real_extract(self, url):
                 self.url_result(dailymotion_url, DailymotionIE.ie_key())
                 for dailymotion_url in dailymotion_urls])
 
-        video_id = self._search_regex(
+        video_id = (
+            traverse_obj(webpage, (
+                {find_element(tag='button', attr='data-cy', value='francetv-player-wrapper', html=True)},
+                {extract_attributes}, 'id'))
+            or self._search_regex(
                 (r'player\.load[^;]+src:\s*["\']([^"\']+)',
                  r'id-video=([^@]+@[^"]+)',
                  r'<a[^>]+href="(?:https?:)?//videos\.francetv\.fr/video/([^@]+@[^"]+)"',
                  r'(?:data-id|<figure[^<]+\bid)=["\']([\da-f]{8}-[\da-f]{4}-[\da-f]{4}-[\da-f]{4}-[\da-f]{12})'),
                 webpage, 'video id')
+        )
 
         return self._make_url_result(video_id, url=url)
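find_element gives traverse_obj a selector-like step over raw HTML: here it pulls the <button data-cy="francetv-player-wrapper"> element, and html=True keeps the whole tag rather than its inner text so extract_attributes can read its id. A hedged standalone example of the same pipeline, assuming the helper behaves as used above:

from yt_dlp.utils import extract_attributes
from yt_dlp.utils.traversal import find_element, traverse_obj

html = '<div><button data-cy="francetv-player-wrapper" id="abc-123">Play</button></div>'
video_id = traverse_obj(html, (
    {find_element(tag='button', attr='data-cy', value='francetv-player-wrapper', html=True)},
    {extract_attributes}, 'id'))
assert video_id == 'abc-123'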
yt_dlp/extractor/generic.py
@@ -293,6 +293,19 @@ class GenericIE(InfoExtractor):
                 'timestamp': 1378272859.0,
             },
         },
+        # Live DASH MPD
+        {
+            'url': 'https://livesim2.dashif.org/livesim2/ato_10/testpic_2s/Manifest.mpd',
+            'info_dict': {
+                'id': 'Manifest',
+                'ext': 'mp4',
+                'title': r're:Manifest \d{4}-\d{2}-\d{2} \d{2}:\d{2}$',
+                'live_status': 'is_live',
+            },
+            'params': {
+                'skip_download': 'livestream',
+            },
+        },
         # m3u8 served with Content-Type: audio/x-mpegURL; charset=utf-8
         {
             'url': 'http://once.unicornmedia.com/now/master/playlist/bb0b18ba-64f5-4b1b-a29f-0ac252f06b68/77a785f3-5188-4806-b788-0893a61634ed/93677179-2d99-4ef4-9e17-fe70d49abfbf/content.m3u8',
@@ -2436,10 +2449,9 @@ def _real_extract(self, url):
             subtitles = {}
             if format_id.endswith('mpegurl') or ext == 'm3u8':
                 formats, subtitles = self._extract_m3u8_formats_and_subtitles(url, video_id, 'mp4', headers=headers)
-            elif format_id.endswith(('mpd', 'dash+xml')) or ext == 'mpd':
-                formats, subtitles = self._extract_mpd_formats_and_subtitles(url, video_id, headers=headers)
             elif format_id == 'f4m' or ext == 'f4m':
                 formats = self._extract_f4m_formats(url, video_id, headers=headers)
+            # Don't check for DASH/mpd here, do it later w/ first_bytes. Same number of requests either way
             else:
                 formats = [{
                     'format_id': format_id,
@@ -2521,6 +2533,7 @@ def _real_extract(self, url):
                 doc,
                 mpd_base_url=full_response.url.rpartition('/')[0],
                 mpd_url=url)
+            info_dict['live_status'] = 'is_live' if doc.get('type') == 'dynamic' else None
             self._extra_manifest_info(info_dict, url)
             self.report_detected('DASH manifest')
             return info_dict
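The one-line addition works because MPEG-DASH encodes liveness in the manifest itself: an MPD with type="dynamic" is a live presentation, while type="static" (the default when the attribute is absent) is VOD. A minimal check of the same attribute, assuming the manifest text is in mpd_xml:

import xml.etree.ElementTree as ET

mpd_xml = '<MPD xmlns="urn:mpeg:dash:schema:mpd:2011" type="dynamic"></MPD>'
root = ET.fromstring(mpd_xml)
# Per the DASH spec, MPD@type defaults to "static" (VOD) when absent
live_status = 'is_live' if root.get('type') == 'dynamic' else None
assert live_status == 'is_live'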
yt_dlp/extractor/globo.py
@@ -69,8 +69,13 @@ class GloboIE(InfoExtractor):
         'info_dict': {
             'id': '8013907',
             'ext': 'mp4',
-            'title': 'Capítulo de 14⧸08⧸1989',
+            'title': 'Capítulo de 14/08/1989',
+            'episode': 'Episode 1',
             'episode_number': 1,
+            'uploader': 'Tieta',
+            'uploader_id': '11895',
+            'duration': 2858.389,
+            'subtitles': 'count:1',
         },
         'params': {
             'skip_download': True,
@@ -82,7 +87,12 @@ class GloboIE(InfoExtractor):
             'id': '12824146',
             'ext': 'mp4',
             'title': 'Acordo de damas',
+            'episode': 'Episode 1',
             'episode_number': 1,
+            'uploader': 'Rensga Hits!',
+            'uploader_id': '20481',
+            'duration': 1953.994,
+            'season': 'Season 2',
             'season_number': 2,
         },
         'params': {
@@ -136,9 +146,10 @@ def _real_extract(self, url):
         else:
             formats, subtitles = self._extract_m3u8_formats_and_subtitles(
                 main_source['url'], video_id, 'mp4', m3u8_id='hls')
-        self._merge_subtitles(traverse_obj(main_source, ('text', ..., {
-            'url': ('subtitle', 'srt', 'url', {url_or_none}),
-        }, all, {subs_list_to_dict(lang='en')})), target=subtitles)
+
+        self._merge_subtitles(traverse_obj(main_source, ('text', ..., ('caption', 'subtitle'), {
+            'url': ('srt', 'url', {url_or_none}),
+        }, all, {subs_list_to_dict(lang='pt-BR')})), target=subtitles)
 
         return {
             'id': video_id,
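subs_list_to_dict collapses a list of {'url': ...} subtitle entries into the {lang: [tracks]} mapping yt-dlp expects; the change both widens the traversal (caption or subtitle nodes) and fixes the language tag, since Globo's tracks are Brazilian Portuguese, not English. A hedged sketch of the helper's shape, with a hypothetical input resembling the API data after traversal:

from yt_dlp.utils.traversal import subs_list_to_dict

tracks = [{'url': 'https://example.com/track.srt'}]  # placeholder track list
subtitles = subs_list_to_dict(lang='pt-BR')(tracks)
# -> {'pt-BR': [{'url': 'https://example.com/track.srt'}]}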
yt_dlp/extractor/instagram.py
@@ -2,12 +2,12 @@
 import itertools
 import json
 import re
-import time
 
 from .common import InfoExtractor
 from ..networking.exceptions import HTTPError
 from ..utils import (
     ExtractorError,
+    bug_reports_message,
     decode_base_n,
     encode_base_n,
     filter_dict,
@@ -15,12 +15,12 @@
     format_field,
     get_element_by_attribute,
     int_or_none,
+    join_nonempty,
     lowercase_escape,
     str_or_none,
     str_to_int,
     traverse_obj,
     url_or_none,
-    urlencode_postdata,
 )
 
 _ENCODING_CHARS = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_'
@@ -28,64 +28,31 @@
 
 def _pk_to_id(media_id):
     """Source: https://stackoverflow.com/questions/24437823/getting-instagram-post-url-from-media-id"""
-    return encode_base_n(int(media_id.split('_')[0]), table=_ENCODING_CHARS)
+    pk = int(str(media_id).split('_')[0])
+    return encode_base_n(pk, table=_ENCODING_CHARS)
 
 
 def _id_to_pk(shortcode):
-    """Covert a shortcode to a numeric value"""
-    return decode_base_n(shortcode[:11], table=_ENCODING_CHARS)
+    """Convert a shortcode to a numeric value"""
+    if len(shortcode) > 28:
+        shortcode = shortcode[:-28]
+    return decode_base_n(shortcode, table=_ENCODING_CHARS)
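The pk↔shortcode mapping is plain base-64 over the alphabet above: a post's numeric pk encodes to its URL shortcode and back, which is why the functions can round-trip. A small demonstration reusing the same helpers (the pk value is arbitrary, not a real post):

from yt_dlp.utils import decode_base_n, encode_base_n

_ENCODING_CHARS = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_'

pk = 2482547250712382859  # illustrative media pk
shortcode = encode_base_n(pk, table=_ENCODING_CHARS)
assert decode_base_n(shortcode, table=_ENCODING_CHARS) == pk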
 class InstagramBaseIE(InfoExtractor):
-    _NETRC_MACHINE = 'instagram'
-    _IS_LOGGED_IN = False
-
     _API_BASE_URL = 'https://i.instagram.com/api/v1'
     _LOGIN_URL = 'https://www.instagram.com/accounts/login'
-    _API_HEADERS = {
-        'X-IG-App-ID': '936619743392459',
+
+    @property
+    def _api_headers(self):
+        return {
+            'X-IG-App-ID': self._configuration_arg('app_id', ['936619743392459'], ie_key=InstagramIE)[0],
             'X-ASBD-ID': '198387',
             'X-IG-WWW-Claim': '0',
             'Origin': 'https://www.instagram.com',
             'Accept': '*/*',
         }
 
-    def _perform_login(self, username, password):
-        if self._IS_LOGGED_IN:
-            return
-
-        login_webpage = self._download_webpage(
-            self._LOGIN_URL, None, note='Downloading login webpage', errnote='Failed to download login webpage')
-
-        shared_data = self._parse_json(self._search_regex(
-            r'window\._sharedData\s*=\s*({.+?});', login_webpage, 'shared data', default='{}'), None)
-
-        login = self._download_json(
-            f'{self._LOGIN_URL}/ajax/', None, note='Logging in', headers={
-                **self._API_HEADERS,
-                'X-Requested-With': 'XMLHttpRequest',
-                'X-CSRFToken': shared_data['config']['csrf_token'],
-                'X-Instagram-AJAX': shared_data['rollout_hash'],
-                'Referer': 'https://www.instagram.com/',
-            }, data=urlencode_postdata({
-                'enc_password': f'#PWD_INSTAGRAM_BROWSER:0:{int(time.time())}:{password}',
-                'username': username,
-                'queryParams': '{}',
-                'optIntoOneTap': 'false',
-                'stopDeletionNonce': '',
-                'trustedDeviceRecords': '{}',
-            }))
-
-        if not login.get('authenticated'):
-            if login.get('message'):
-                raise ExtractorError(f'Unable to login: {login["message"]}')
-            elif login.get('user'):
-                raise ExtractorError('Unable to login: Sorry, your password was incorrect. Please double-check your password.', expected=True)
-            elif login.get('user') is False:
-                raise ExtractorError('Unable to login: The username you entered doesn\'t belong to an account. Please check your username and try again.', expected=True)
-            raise ExtractorError('Unable to login')
-        InstagramBaseIE._IS_LOGGED_IN = True
-
     def _get_count(self, media, kind, *keys):
         return traverse_obj(
             media, (kind, 'count'), *((f'edge_media_{key}', 'count') for key in keys),
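Turning the header dict into a property means the app ID is now read per request through _configuration_arg, i.e. it can be overridden from the command line via extractor-args without a code change. Schematically, that lookup reduces to (a sketch with a placeholder value, not a working app ID):

# Parsed from something like: yt-dlp --extractor-args "instagram:app_id=1234567890"
extractor_args = {'instagram': {'app_id': ['1234567890']}}

def configuration_arg(key, default, ie_key='instagram'):
    # Mirrors _configuration_arg: returns the user-supplied list or the default
    return extractor_args.get(ie_key, {}).get(key, default)

app_id = configuration_arg('app_id', ['936619743392459'])[0]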
@@ -209,7 +176,7 @@ def _extract_product(self, product_info):
     def _get_comments(self, video_id):
         comments_info = self._download_json(
             f'{self._API_BASE_URL}/media/{_id_to_pk(video_id)}/comments/?can_support_threading=true&permalink_enabled=false', video_id,
-            fatal=False, errnote='Comments extraction failed', note='Downloading comments info', headers=self._API_HEADERS) or {}
+            fatal=False, errnote='Comments extraction failed', note='Downloading comments info', headers=self._api_headers) or {}
 
         comment_data = traverse_obj(comments_info, ('edge_media_to_parent_comment', 'edges'), 'comments')
         for comment_dict in comment_data or []:
@@ -402,14 +369,14 @@ def _real_extract(self, url):
         info = traverse_obj(self._download_json(
             f'{self._API_BASE_URL}/media/{_id_to_pk(video_id)}/info/', video_id,
             fatal=False, errnote='Video info extraction failed',
-            note='Downloading video info', headers=self._API_HEADERS), ('items', 0))
+            note='Downloading video info', headers=self._api_headers), ('items', 0))
         if info:
             media.update(info)
             return self._extract_product(media)
 
         api_check = self._download_json(
             f'{self._API_BASE_URL}/web/get_ruling_for_content/?content_type=MEDIA&target_id={_id_to_pk(video_id)}',
-            video_id, headers=self._API_HEADERS, fatal=False, note='Setting up session', errnote=False) or {}
+            video_id, headers=self._api_headers, fatal=False, note='Setting up session', errnote=False) or {}
         csrf_token = self._get_cookies('https://www.instagram.com').get('csrftoken')
 
         if not csrf_token:
@@ -429,7 +396,7 @@ def _real_extract(self, url):
         general_info = self._download_json(
             'https://www.instagram.com/graphql/query/', video_id, fatal=False, errnote=False,
             headers={
-                **self._API_HEADERS,
+                **self._api_headers,
                 'X-CSRFToken': csrf_token or '',
                 'X-Requested-With': 'XMLHttpRequest',
                 'Referer': url,
@@ -437,7 +404,6 @@ def _real_extract(self, url):
                 'doc_id': '8845758582119845',
                 'variables': json.dumps(variables, separators=(',', ':')),
             })
-        media.update(traverse_obj(general_info, ('data', 'xdt_shortcode_media')) or {})
 
         if not general_info:
             self.report_warning('General metadata extraction failed (some metadata might be missing).', video_id)
@@ -466,6 +432,26 @@ def _real_extract(self, url):
             media.update(traverse_obj(
                 additional_data, ('graphql', 'shortcode_media'), 'shortcode_media', expected_type=dict) or {})
 
+        else:
+            xdt_shortcode_media = traverse_obj(general_info, ('data', 'xdt_shortcode_media', {dict})) or {}
+            if not xdt_shortcode_media:
+                error = join_nonempty('title', 'description', delim=': ', from_dict=api_check)
+                if 'Restricted Video' in error:
+                    self.raise_login_required(error)
+                elif error:
+                    raise ExtractorError(error, expected=True)
+                elif len(video_id) > 28:
+                    # It's a private post (video_id == shortcode + 28 extra characters)
+                    # Only raise after getting empty response; sometimes "long"-shortcode posts are public
+                    self.raise_login_required(
+                        'This content is only available for registered users who follow this account')
+                raise ExtractorError(
+                    'Instagram sent an empty media response. Check if this post is accessible in your '
+                    f'browser without being logged-in. If it is not, then u{self._login_hint()[1:]}. '
+                    'Otherwise, if the post is accessible in browser without being logged-in'
+                    f'{bug_reports_message(before=",")}', expected=True)
+            media.update(xdt_shortcode_media)
+
         username = traverse_obj(media, ('owner', 'username')) or self._search_regex(
             r'"owner"\s*:\s*{\s*"username"\s*:\s*"(.+?)"', webpage, 'username', fatal=False)
 
@@ -485,8 +471,7 @@ def _real_extract(self, url):
             return self.playlist_result(
                 self._extract_nodes(nodes, True), video_id,
                 format_field(username, None, 'Post by %s'), description)
-
-        video_url = self._og_search_video_url(webpage, secure=False)
+        raise ExtractorError('There is no video in this post', expected=True)
 
         formats = [{
             'url': video_url,
@@ -689,7 +674,7 @@ def _query_vars_for(data):
 
 
 class InstagramStoryIE(InstagramBaseIE):
-    _VALID_URL = r'https?://(?:www\.)?instagram\.com/stories/(?P<user>[^/]+)/(?P<id>\d+)'
+    _VALID_URL = r'https?://(?:www\.)?instagram\.com/stories/(?P<user>[^/?#]+)(?:/(?P<id>\d+))?'
     IE_NAME = 'instagram:story'
 
     _TESTS = [{
@@ -699,25 +684,38 @@ class InstagramStoryIE(InstagramBaseIE):
             'title': 'Rare',
         },
         'playlist_mincount': 50,
+    }, {
+        'url': 'https://www.instagram.com/stories/fruits_zipper/3570766765028588805/',
+        'only_matching': True,
+    }, {
+        'url': 'https://www.instagram.com/stories/fruits_zipper',
+        'only_matching': True,
     }]
 
     def _real_extract(self, url):
-        username, story_id = self._match_valid_url(url).groups()
-        story_info = self._download_webpage(url, story_id)
-        user_info = self._search_json(r'"user":', story_info, 'user info', story_id, fatal=False)
+        username, story_id = self._match_valid_url(url).group('user', 'id')
+        if username == 'highlights' and not story_id:  # story id is only mandatory for highlights
+            raise ExtractorError('Input URL is missing a highlight ID', expected=True)
+        display_id = story_id or username
+        story_info = self._download_webpage(url, display_id)
+        user_info = self._search_json(r'"user":', story_info, 'user info', display_id, fatal=False)
         if not user_info:
             self.raise_login_required('This content is unreachable')
 
         user_id = traverse_obj(user_info, 'pk', 'id', expected_type=str)
-        story_info_url = user_id if username != 'highlights' else f'highlight:{story_id}'
-        if not story_info_url:  # user id is only mandatory for non-highlights
-            raise ExtractorError('Unable to extract user id')
+        if username == 'highlights':
+            story_info_url = f'highlight:{story_id}'
+        else:
+            if not user_id:  # user id is only mandatory for non-highlights
+                raise ExtractorError('Unable to extract user id')
+            story_info_url = user_id
 
         videos = traverse_obj(self._download_json(
             f'{self._API_BASE_URL}/feed/reels_media/?reel_ids={story_info_url}',
-            story_id, errnote=False, fatal=False, headers=self._API_HEADERS), 'reels')
+            display_id, errnote=False, fatal=False, headers=self._api_headers), 'reels')
         if not videos:
             self.raise_login_required('You need to log in to access this content')
+        user_info = traverse_obj(videos, (user_id, 'user', {dict})) or {}
 
         full_name = traverse_obj(videos, (f'highlight:{story_id}', 'user', 'full_name'), (user_id, 'user', 'full_name'))
         story_title = traverse_obj(videos, (f'highlight:{story_id}', 'title'))
@@ -727,6 +725,7 @@ def _real_extract(self, url):
         highlights = traverse_obj(videos, (f'highlight:{story_id}', 'items'), (user_id, 'items'))
         info_data = []
         for highlight in highlights:
+            highlight.setdefault('user', {}).update(user_info)
             highlight_data = self._extract_product(highlight)
             if highlight_data.get('formats'):
                 info_data.append({
@@ -734,4 +733,7 @@ def _real_extract(self, url):
                     'uploader_id': user_id,
                     **filter_dict(highlight_data),
                 })
+        if username != 'highlights' and story_id and not self._yes_playlist(username, story_id):
+            return traverse_obj(info_data, (lambda _, v: v['id'] == _pk_to_id(story_id), any))
+
         return self.playlist_result(info_data, playlist_id=story_id, playlist_title=story_title)
yt_dlp/extractor/lbry.py
@@ -26,6 +26,7 @@ class LBRYBaseIE(InfoExtractor):
     _CLAIM_ID_REGEX = r'[0-9a-f]{1,40}'
     _OPT_CLAIM_ID = f'[^$@:/?#&]+(?:[:#]{_CLAIM_ID_REGEX})?'
     _SUPPORTED_STREAM_TYPES = ['video', 'audio']
+    _UNSUPPORTED_STREAM_TYPES = ['binary']
     _PAGE_SIZE = 50
 
     def _call_api_proxy(self, method, display_id, params, resource):
@@ -336,12 +337,15 @@ def _real_extract(self, url):
                     'vcodec': 'none' if stream_type == 'audio' else None,
                 })
 
+            final_url = None
             # HEAD request returns redirect response to m3u8 URL if available
-            final_url = self._request_webpage(
+            urlh = self._request_webpage(
                 HEADRequest(streaming_url), display_id, headers=headers,
-                note='Downloading streaming redirect url info').url
+                note='Downloading streaming redirect url info', fatal=False)
+            if urlh:
+                final_url = urlh.url
 
-        elif result.get('value_type') == 'stream':
+        elif result.get('value_type') == 'stream' and stream_type not in self._UNSUPPORTED_STREAM_TYPES:
             claim_id, is_live = result['signing_channel']['claim_id'], True
             live_data = self._download_json(
                 'https://api.odysee.live/livestream/is_live', claim_id,
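The reworked HEAD probe degrades gracefully: with fatal=False, a failed request no longer aborts extraction, and final_url simply stays None so the non-HLS path is used. The same defensive shape in isolation, with plain urllib standing in for the extractor helper:

import urllib.request

final_url = None
try:
    # HEAD probe; any redirect is followed and the final URL recorded
    req = urllib.request.Request('https://example.com/stream', method='HEAD')
    with urllib.request.urlopen(req) as resp:
        final_url = resp.url
except OSError:
    pass  # mirrors fatal=False: failure leaves final_url as None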
yt_dlp/extractor/magellantv.py
@@ -1,35 +1,36 @@
 from .common import InfoExtractor
-from ..utils import parse_age_limit, parse_duration, traverse_obj
+from ..utils import parse_age_limit, parse_duration, url_or_none
+from ..utils.traversal import traverse_obj
 
 
 class MagellanTVIE(InfoExtractor):
     _VALID_URL = r'https?://(?:www\.)?magellantv\.com/(?:watch|video)/(?P<id>[\w-]+)'
     _TESTS = [{
-        'url': 'https://www.magellantv.com/watch/my-dads-on-death-row?type=v',
+        'url': 'https://www.magellantv.com/watch/incas-the-new-story?type=v',
         'info_dict': {
-            'id': 'my-dads-on-death-row',
+            'id': 'incas-the-new-story',
             'ext': 'mp4',
-            'title': 'My Dad\'s On Death Row',
-            'description': 'md5:33ba23b9f0651fc4537ed19b1d5b0d7a',
-            'duration': 3780.0,
+            'title': 'Incas: The New Story',
+            'description': 'md5:936c7f6d711c02dfb9db22a067b586fe',
             'age_limit': 14,
-            'tags': ['Justice', 'Reality', 'United States', 'True Crime'],
+            'duration': 3060.0,
+            'tags': ['Ancient History', 'Archaeology', 'Anthropology'],
         },
         'params': {'skip_download': 'm3u8'},
     }, {
-        'url': 'https://www.magellantv.com/video/james-bulger-the-new-revelations',
+        'url': 'https://www.magellantv.com/video/tortured-to-death-murdering-the-nanny',
         'info_dict': {
-            'id': 'james-bulger-the-new-revelations',
+            'id': 'tortured-to-death-murdering-the-nanny',
             'ext': 'mp4',
-            'title': 'James Bulger: The New Revelations',
-            'description': 'md5:7b97922038bad1d0fe8d0470d8a189f2',
+            'title': 'Tortured to Death: Murdering the Nanny',
+            'description': 'md5:d87033594fa218af2b1a8b49f52511e5',
+            'age_limit': 14,
             'duration': 2640.0,
-            'age_limit': 0,
-            'tags': ['Investigation', 'True Crime', 'Justice', 'Europe'],
+            'tags': ['True Crime', 'Murder'],
         },
         'params': {'skip_download': 'm3u8'},
     }, {
-        'url': 'https://www.magellantv.com/watch/celebration-nation',
+        'url': 'https://www.magellantv.com/watch/celebration-nation?type=s',
         'info_dict': {
             'id': 'celebration-nation',
             'ext': 'mp4',
@@ -43,10 +44,19 @@ class MagellanTVIE(InfoExtractor):
     def _real_extract(self, url):
         video_id = self._match_id(url)
         webpage = self._download_webpage(url, video_id)
-        data = traverse_obj(self._search_nextjs_data(webpage, video_id), (
-            'props', 'pageProps', 'reactContext',
-            (('video', 'detail'), ('series', 'currentEpisode')), {dict}), get_all=False)
-        formats, subtitles = self._extract_m3u8_formats_and_subtitles(data['jwpVideoUrl'], video_id)
+        context = self._search_nextjs_data(webpage, video_id)['props']['pageProps']['reactContext']
+        data = traverse_obj(context, ((('video', 'detail'), ('series', 'currentEpisode')), {dict}, any))
+
+        formats, subtitles = [], {}
+        for m3u8_url in set(traverse_obj(data, ((('manifests', ..., 'hls'), 'jwp_video_url'), {url_or_none}))):
+            fmts, subs = self._extract_m3u8_formats_and_subtitles(
+                m3u8_url, video_id, 'mp4', m3u8_id='hls', fatal=False)
+            formats.extend(fmts)
+            self._merge_subtitles(subs, target=subtitles)
+        if not formats and (error := traverse_obj(context, ('errorDetailPage', 'errorMessage', {str}))):
+            if 'available in your country' in error:
+                self.raise_geo_restricted(msg=error)
+            self.raise_no_formats(f'{self.IE_NAME} said: {error}', expected=True)
 
         return {
             'id': video_id,
@ -4,7 +4,9 @@
from ..utils import (
extract_attributes,
unified_timestamp,
url_or_none,
)
from ..utils.traversal import traverse_obj


class N1InfoAssetIE(InfoExtractor):
@ -35,9 +37,9 @@ class N1InfoIIE(InfoExtractor):
IE_NAME = 'N1Info:article'
_VALID_URL = r'https?://(?:(?:\w+\.)?n1info\.\w+|nova\.rs)/(?:[^/?#]+/){1,2}(?P<id>[^/?#]+)'
_TESTS = [{
# Youtube embedded
# YouTube embedded
'url': 'https://rs.n1info.com/sport-klub/tenis/kako-je-djokovic-propustio-istorijsku-priliku-video/',
'md5': '01ddb6646d0fd9c4c7d990aa77fe1c5a',
'md5': '987ce6fd72acfecc453281e066b87973',
'info_dict': {
'id': 'L5Hd4hQVUpk',
'ext': 'mp4',
@ -45,7 +47,26 @@ class N1InfoIIE(InfoExtractor):
'title': 'Ozmo i USO21, ep. 13: Novak Đoković – Danil Medvedev | Ključevi Poraza, Budućnost | SPORT KLUB TENIS',
'description': 'md5:467f330af1effedd2e290f10dc31bb8e',
'uploader': 'Sport Klub',
'uploader_id': 'sportklub',
'uploader_id': '@sportklub',
'uploader_url': 'https://www.youtube.com/@sportklub',
'channel': 'Sport Klub',
'channel_id': 'UChpzBje9Ro6CComXe3BgNaw',
'channel_url': 'https://www.youtube.com/channel/UChpzBje9Ro6CComXe3BgNaw',
'channel_is_verified': True,
'channel_follower_count': int,
'comment_count': int,
'view_count': int,
'like_count': int,
'age_limit': 0,
'duration': 1049,
'thumbnail': 'https://i.ytimg.com/vi/L5Hd4hQVUpk/maxresdefault.jpg',
'chapters': 'count:9',
'categories': ['Sports'],
'tags': 'count:10',
'timestamp': 1631522787,
'playable_in_embed': True,
'availability': 'public',
'live_status': 'not_live',
},
}, {
'url': 'https://rs.n1info.com/vesti/djilas-los-plan-za-metro-nece-resiti-nijedan-saobracajni-problem/',
@ -55,6 +76,7 @@ class N1InfoIIE(InfoExtractor):
'title': 'Đilas: Predlog izgradnje metroa besmislen; SNS odbacuje navode',
'upload_date': '20210924',
'timestamp': 1632481347,
'thumbnail': 'http://n1info.rs/wp-content/themes/ucnewsportal-n1/dist/assets/images/placeholder-image-video.jpg',
},
'params': {
'skip_download': True,
@ -67,6 +89,7 @@ class N1InfoIIE(InfoExtractor):
'title': 'Zadnji dnevi na kopališču Ilirija: “Ilirija ni umrla, ubili so jo”',
'timestamp': 1632567630,
'upload_date': '20210925',
'thumbnail': 'https://n1info.si/wp-content/uploads/2021/09/06/1630945843-tomaz3.png',
},
'params': {
'skip_download': True,
@ -81,6 +104,14 @@ class N1InfoIIE(InfoExtractor):
'upload_date': '20210924',
'timestamp': 1632448649.0,
'uploader': 'YouLotWhatDontStop',
'display_id': 'pu9wbx',
'channel_id': 'serbia',
'comment_count': int,
'like_count': int,
'dislike_count': int,
'age_limit': 0,
'duration': 134,
'thumbnail': 'https://external-preview.redd.it/5nmmawSeGx60miQM3Iq-ueC9oyCLTLjjqX-qqY8uRsc.png?format=pjpg&auto=webp&s=2f973400b04d23f871b608b178e47fc01f9b8f1d',
},
'params': {
'skip_download': True,
@ -93,6 +124,7 @@ class N1InfoIIE(InfoExtractor):
'title': 'Žaklina Tatalović Ani Brnabić: Pričate laži (VIDEO)',
'upload_date': '20211102',
'timestamp': 1635861677,
'thumbnail': 'https://nova.rs/wp-content/uploads/2021/11/02/1635860298-TNJG_Ana_Brnabic_i_Zaklina_Tatalovic_100_dana_Vlade_GP.jpg',
},
}, {
'url': 'https://n1info.rs/vesti/cuta-biti-u-kosovskoj-mitrovici-znaci-da-te-docekaju-eksplozivnim-napravama/',
@ -104,6 +136,16 @@ class N1InfoIIE(InfoExtractor):
'timestamp': 1687290536,
'thumbnail': 'https://cdn.brid.tv/live/partners/26827/snapshot/1332368_th_6492013a8356f_1687290170.jpg',
},
}, {
'url': 'https://n1info.rs/vesti/vuciceva-turneja-po-srbiji-najavljuje-kontrarevoluciju-preti-svom-narodu-vredja-novinare/',
'info_dict': {
'id': '2025974',
'ext': 'mp4',
'title': 'Vučićeva turneja po Srbiji: Najavljuje kontrarevoluciju, preti svom narodu, vređa novinare',
'thumbnail': 'https://cdn-uc.brid.tv/live/partners/26827/snapshot/2025974_fhd_67c4a23280a81_1740939826.jpg',
'timestamp': 1740939936,
'upload_date': '20250302',
},
}, {
'url': 'https://hr.n1info.com/vijesti/pravobraniteljica-o-ubojstvu-u-zagrebu-radi-se-o-doista-nezapamcenoj-situaciji/',
'only_matching': True,
@ -115,11 +157,11 @@ def _real_extract(self, url):
title = self._html_search_regex(r'<h1[^>]+>(.+?)</h1>', webpage, 'title')
timestamp = unified_timestamp(self._html_search_meta('article:published_time', webpage))
plugin_data = self._html_search_meta('BridPlugin', webpage)
plugin_data = re.findall(r'\$bp\("(?:Brid|TargetVideo)_\d+",\s(.+)\);', webpage)
entries = []
if plugin_data:
site_id = self._html_search_regex(r'site:(\d+)', webpage, 'site id')
for video_data in re.findall(r'\$bp\("Brid_\d+", (.+)\);', webpage):
for video_data in plugin_data:
video_id = self._parse_json(video_data, title)['video']
entries.append({
'id': video_id,
@ -140,7 +182,7 @@ def _real_extract(self, url):
'url': video_data.get('data-url'),
'id': video_data.get('id'),
'title': title,
'thumbnail': video_data.get('data-thumbnail'),
'thumbnail': traverse_obj(video_data, (('data-thumbnail', 'data-default_thumbnail'), {url_or_none}, any)),
'timestamp': timestamp,
'ie_key': 'N1InfoAsset',
})
@ -152,7 +194,7 @@ def _real_extract(self, url):
if url.startswith('https://www.youtube.com'):
entries.append(self.url_result(url, ie='Youtube'))
elif url.startswith('https://www.redditmedia.com'):
entries.append(self.url_result(url, ie='RedditR'))
entries.append(self.url_result(url, ie='Reddit'))

return {
'_type': 'playlist',
@ -13,11 +13,13 @@
ExtractorError,
OnDemandPagedList,
clean_html,
determine_ext,
float_or_none,
int_or_none,
join_nonempty,
parse_duration,
parse_iso8601,
parse_qs,
parse_resolution,
qualities,
remove_start,
@ -26,6 +28,7 @@
try_get,
unescapeHTML,
update_url_query,
url_basename,
url_or_none,
urlencode_postdata,
urljoin,
@ -430,6 +433,7 @@ def _yield_dms_formats(self, api_data, video_id):
'format_id': ('id', {str}),
'abr': ('bitRate', {float_or_none(scale=1000)}),
'asr': ('samplingRate', {int_or_none}),
'quality': ('qualityLevel', {int_or_none}),
}), get_all=False),
'acodec': 'aac',
}
@ -441,7 +445,9 @@ def _yield_dms_formats(self, api_data, video_id):
min_abr = min(traverse_obj(audios, (..., 'bitRate', {float_or_none})), default=0) / 1000
for video_fmt in video_fmts:
video_fmt['tbr'] -= min_abr
video_fmt['format_id'] = f'video-{video_fmt["tbr"]:.0f}'
video_fmt['format_id'] = url_basename(video_fmt['url']).rpartition('.')[0]
video_fmt['quality'] = traverse_obj(videos, (
lambda _, v: v['id'] == video_fmt['format_id'], 'qualityLevel', {int_or_none}, any)) or -1
yield video_fmt

def _real_extract(self, url):
@ -1033,6 +1039,7 @@ def _real_extract(self, url):
thumbnails.append({
'id': f'{name}_{width}x{height}',
'url': img_url,
'ext': traverse_obj(parse_qs(img_url), ('image', 0, {determine_ext(default_ext='jpg')})),
**res,
})
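Side note on the format_id change above: the new ID is simply the DMS manifest URL's final path component minus its extension. A minimal sketch of that derivation using yt-dlp's url_basename (the sample URL is hypothetical):

from yt_dlp.utils import url_basename

video_url = 'https://example.com/dms/video-h264-1080p.m3u8?session=abc'  # hypothetical
# url_basename() operates on the parsed path, so the query string is already stripped
format_id = url_basename(video_url).rpartition('.')[0]
assert format_id == 'video-h264-1080p'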
@ -501,7 +501,7 @@ def _extract_webpage(self, url):
r"div\s*:\s*'videoembed'\s*,\s*mediaid\s*:\s*'(\d+)'", # frontline video embed
r'class="coveplayerid">([^<]+)<', # coveplayer
r'<section[^>]+data-coveid="(\d+)"', # coveplayer from http://www.pbs.org/wgbh/frontline/film/real-csi/
r'\bclass="passportcoveplayer"[^>]+\bdata-media="(\d+)', # https://www.thirteen.org/programs/the-woodwrights-shop/who-wrote-the-book-of-sloyd-fggvvq/
r'\sclass="passportcoveplayer"[^>]*\sdata-media="(\d+)', # https://www.thirteen.org/programs/the-woodwrights-shop/who-wrote-the-book-of-sloyd-fggvvq/
r'<input type="hidden" id="pbs_video_id_[0-9]+" value="([0-9]+)"/>', # jwplayer
r"(?s)window\.PBS\.playerConfig\s*=\s*{.*?id\s*:\s*'([0-9]+)',",
r'<div[^>]+\bdata-cove-id=["\'](\d+)"', # http://www.pbs.org/wgbh/roadshow/watch/episode/2105-indianapolis-hour-2/
@ -23,9 +23,9 @@ class PinterestBaseIE(InfoExtractor):
def _call_api(self, resource, video_id, options):
return self._download_json(
f'https://www.pinterest.com/resource/{resource}Resource/get/',
video_id, f'Download {resource} JSON metadata', query={
video_id, f'Download {resource} JSON metadata',
'data': json.dumps({'options': options}),
query={'data': json.dumps({'options': options})},
})['resource_response']
headers={'X-Pinterest-PWS-Handler': 'www/[username].js'})['resource_response']

def _extract_video(self, data, extract_formats=True):
video_id = data['id']
@ -1,4 +1,7 @@
import base64
import hashlib
import json
import uuid

from .common import InfoExtractor
from ..utils import (
@ -142,39 +145,73 @@ class PlaySuisseIE(InfoExtractor):
id
url
}'''
_LOGIN_BASE_URL = 'https://login.srgssr.ch/srgssrlogin.onmicrosoft.com'
_CLIENT_ID = '1e33f1bf-8bf3-45e4-bbd9-c9ad934b5fca'
_LOGIN_PATH = 'B2C_1A__SignInV2'
_LOGIN_BASE = 'https://account.srgssr.ch'
_ID_TOKEN = None

def _perform_login(self, username, password):
login_page = self._download_webpage(
code_verifier = uuid.uuid4().hex + uuid.uuid4().hex + uuid.uuid4().hex
'https://www.playsuisse.ch/api/sso/login', None, note='Downloading login page',
code_challenge = base64.urlsafe_b64encode(
query={'x': 'x', 'locale': 'de', 'redirectUrl': 'https://www.playsuisse.ch/'})
hashlib.sha256(code_verifier.encode()).digest()).decode().rstrip('=')
settings = self._search_json(r'var\s+SETTINGS\s*=', login_page, 'settings', None)

csrf_token = settings['csrf']
request_id = parse_qs(self._request_webpage(
query = {'tx': settings['transId'], 'p': self._LOGIN_PATH}
f'{self._LOGIN_BASE}/authz-srv/authz', None, 'Requesting session ID', query={
'client_id': self._CLIENT_ID,
'redirect_uri': 'https://www.playsuisse.ch/auth',
'scope': 'email profile openid offline_access',
'response_type': 'code',
'code_challenge': code_challenge,
'code_challenge_method': 'S256',
'view_type': 'login',
}).url)['requestId'][0]

status = traverse_obj(self._download_json(
try:
f'{self._LOGIN_BASE_URL}/{self._LOGIN_PATH}/SelfAsserted', None, 'Logging in',
exchange_id = self._download_json(
query=query, headers={'X-CSRF-TOKEN': csrf_token}, data=urlencode_postdata({
f'{self._LOGIN_BASE}/verification-srv/v2/authenticate/initiate/password', None,
'request_type': 'RESPONSE',
'Submitting username', headers={'content-type': 'application/json'}, data=json.dumps({
'signInName': username,
'usage_type': 'INITIAL_AUTHENTICATION',
'request_id': request_id,
'medium_id': 'PASSWORD',
'type': 'password',
'identifier': username,
}).encode())['data']['exchange_id']['exchange_id']
except ExtractorError:
raise ExtractorError('Invalid username', expected=True)

try:
login_data = self._download_json(
f'{self._LOGIN_BASE}/verification-srv/v2/authenticate/authenticate/password', None,
'Submitting password', headers={'content-type': 'application/json'}, data=json.dumps({
'requestId': request_id,
'exchange_id': exchange_id,
'type': 'password',
'password': password,
}), expected_status=400), ('status', {int_or_none}))
}).encode())['data']
if status == 400:
except ExtractorError:
raise ExtractorError('Invalid username or password', expected=True)
raise ExtractorError('Invalid password', expected=True)

urlh = self._request_webpage(
authorization_code = parse_qs(self._request_webpage(
f'{self._LOGIN_BASE_URL}/{self._LOGIN_PATH}/api/CombinedSigninAndSignup/confirmed',
f'{self._LOGIN_BASE}/login-srv/verification/login', None, 'Logging in',
None, 'Downloading ID token', query={
data=urlencode_postdata({
'rememberMe': 'false',
'requestId': request_id,
'csrf_token': csrf_token,
'exchange_id': login_data['exchange_id']['exchange_id'],
**query,
'verificationType': 'password',
'diags': '',
'sub': login_data['sub'],
})
'status_id': login_data['status_id'],
'rememberMe': True,
'lat': '',
'lon': '',
})).url)['code'][0]

self._ID_TOKEN = self._download_json(
f'{self._LOGIN_BASE}/proxy/token', None, 'Downloading token', data=b'', query={
'client_id': self._CLIENT_ID,
'redirect_uri': 'https://www.playsuisse.ch/auth',
'code': authorization_code,
'code_verifier': code_verifier,
'grant_type': 'authorization_code',
})['id_token']

self._ID_TOKEN = traverse_obj(parse_qs(urlh.url), ('id_token', 0))
if not self._ID_TOKEN:
raise ExtractorError('Login failed')
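A quick sketch of the PKCE pair the new login flow builds above, using only the standard library (the verifier is three concatenated uuid4 hex strings, i.e. 96 hex characters):

import base64
import hashlib
import uuid

code_verifier = uuid.uuid4().hex + uuid.uuid4().hex + uuid.uuid4().hex
# S256 challenge: base64url-encoded SHA-256 digest with the '=' padding stripped
code_challenge = base64.urlsafe_b64encode(
    hashlib.sha256(code_verifier.encode()).digest()).decode().rstrip('=')
assert len(code_verifier) == 96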
@ -198,6 +198,25 @@ class RedditIE(InfoExtractor):
|
|||||||
'skip_download': True,
|
'skip_download': True,
|
||||||
'writesubtitles': True,
|
'writesubtitles': True,
|
||||||
},
|
},
|
||||||
|
}, {
|
||||||
|
# "gated" subreddit post
|
||||||
|
'url': 'https://old.reddit.com/r/ketamine/comments/degtjo/when_the_k_hits/',
|
||||||
|
'info_dict': {
|
||||||
|
'id': 'gqsbxts133r31',
|
||||||
|
'ext': 'mp4',
|
||||||
|
'display_id': 'degtjo',
|
||||||
|
'title': 'When the K hits',
|
||||||
|
'uploader': '[deleted]',
|
||||||
|
'channel_id': 'ketamine',
|
||||||
|
'comment_count': int,
|
||||||
|
'like_count': int,
|
||||||
|
'dislike_count': int,
|
||||||
|
'age_limit': 18,
|
||||||
|
'duration': 34,
|
||||||
|
'thumbnail': r're:https?://.+/.+\.(?:jpg|png)',
|
||||||
|
'timestamp': 1570438713.0,
|
||||||
|
'upload_date': '20191007',
|
||||||
|
},
|
||||||
}, {
|
}, {
|
||||||
'url': 'https://www.reddit.com/r/videos/comments/6rrwyj',
|
'url': 'https://www.reddit.com/r/videos/comments/6rrwyj',
|
||||||
'only_matching': True,
|
'only_matching': True,
|
||||||
@ -245,6 +264,15 @@ def _perform_login(self, username, password):
|
|||||||
elif not traverse_obj(login, ('json', 'data', 'cookie', {str})):
|
elif not traverse_obj(login, ('json', 'data', 'cookie', {str})):
|
||||||
raise ExtractorError('Unable to login, no cookie was returned')
|
raise ExtractorError('Unable to login, no cookie was returned')
|
||||||
|
|
||||||
|
def _real_initialize(self):
|
||||||
|
# Set cookie to opt-in to age-restricted subreddits
|
||||||
|
self._set_cookie('reddit.com', 'over18', '1')
|
||||||
|
# Set cookie to opt-in to "gated" subreddits
|
||||||
|
options = traverse_obj(self._get_cookies('https://www.reddit.com/'), (
|
||||||
|
'_options', 'value', {urllib.parse.unquote}, {json.loads}, {dict})) or {}
|
||||||
|
options['pref_gated_sr_optin'] = True
|
||||||
|
self._set_cookie('reddit.com', '_options', urllib.parse.quote(json.dumps(options)))
|
||||||
|
|
||||||
def _get_subtitles(self, video_id):
|
def _get_subtitles(self, video_id):
|
||||||
# Fallback if there were no subtitles provided by DASH or HLS manifests
|
# Fallback if there were no subtitles provided by DASH or HLS manifests
|
||||||
caption_url = f'https://v.redd.it/{video_id}/wh_ben_en.vtt'
|
caption_url = f'https://v.redd.it/{video_id}/wh_ben_en.vtt'
|
||||||
|
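Since the _options cookie above stores URL-quoted JSON, the gated-subreddit opt-in round-trips like this standalone sketch (the starting cookie value is hypothetical):

import json
import urllib.parse

raw = urllib.parse.quote(json.dumps({'other_pref': False}))  # hypothetical existing cookie value
options = json.loads(urllib.parse.unquote(raw)) or {}
options['pref_gated_sr_optin'] = True
cookie_value = urllib.parse.quote(json.dumps(options))  # what gets set back on reddit.com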
@ -3,12 +3,20 @@
import re
import urllib.parse

from .common import InfoExtractor
from .common import InfoExtractor, Request
from ..utils import js_to_json
from ..utils import (
determine_ext,
int_or_none,
js_to_json,
parse_duration,
parse_iso8601,
url_or_none,
)
from ..utils.traversal import traverse_obj


class RTPIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?rtp\.pt/play/(?:(?:estudoemcasa|palco|zigzag)/)?p(?P<program_id>[0-9]+)/(?P<id>[^/?#]+)'
_VALID_URL = r'https?://(?:www\.)?rtp\.pt/play/(?:[^/#?]+/)?p(?P<program_id>\d+)/(?P<id>e\d+)'
_TESTS = [{
'url': 'http://www.rtp.pt/play/p405/e174042/paixoes-cruzadas',
'md5': 'e736ce0c665e459ddb818546220b4ef8',
@ -16,99 +24,173 @@ class RTPIE(InfoExtractor):
'id': 'e174042',
'ext': 'mp3',
'title': 'Paixões Cruzadas',
'description': 'As paixões musicais de António Cartaxo e António Macedo',
'description': 'md5:af979e58ba0ab73f78435fc943fdb070',
'thumbnail': r're:^https?://.*\.jpg',
'series': 'Paixões Cruzadas',
'duration': 2950.0,
'modified_timestamp': 1553693464,
'modified_date': '20190327',
'timestamp': 1417219200,
'upload_date': '20141129',
},
}, {
'url': 'https://www.rtp.pt/play/zigzag/p13166/e757904/25-curiosidades-25-de-abril',
'md5': '9a81ed53f2b2197cfa7ed455b12f8ade',
'md5': '5b4859940e3adef61247a77dfb76046a',
'info_dict': {
'id': 'e757904',
'ext': 'mp4',
'title': '25 Curiosidades, 25 de Abril',
'title': 'Estudar ou não estudar',
'description': 'Estudar ou não estudar - Em cada um dos episódios descobrimos uma curiosidade acerca de como era viver em Portugal antes da revolução do 25 de abr',
'description': 'md5:3bfd7eb8bebfd5711a08df69c9c14c35',
'thumbnail': r're:^https?://.*\.jpg',
'timestamp': 1711958401,
'duration': 146.0,
'upload_date': '20240401',
'modified_timestamp': 1712242991,
'series': '25 Curiosidades, 25 de Abril',
'episode_number': 2,
'episode': 'Estudar ou não estudar',
'modified_date': '20240404',
},
}, {
'url': 'http://www.rtp.pt/play/p831/a-quimica-das-coisas',
# Episode not accessible through API
'only_matching': True,
'url': 'https://www.rtp.pt/play/estudoemcasa/p7776/e500050/portugues-1-ano',
}, {
'md5': '57660c0b46db9f22118c52cbd65975e4',
'url': 'https://www.rtp.pt/play/estudoemcasa/p7776/portugues-1-ano',
'info_dict': {
'only_matching': True,
'id': 'e500050',
}, {
'ext': 'mp4',
'url': 'https://www.rtp.pt/play/palco/p13785/l7nnon',
'title': 'Português - 1.º ano',
'only_matching': True,
'duration': 1669.0,
'description': 'md5:be68925c81269f8c6886589f25fe83ea',
'upload_date': '20201020',
'timestamp': 1603180799,
'thumbnail': 'https://cdn-images.rtp.pt/EPG/imagens/39482_59449_64850.png?v=3&w=860',
},
}]

_USER_AGENT = 'rtpplay/2.0.66 (pt.rtp.rtpplay; build:2066; iOS 15.8.3) Alamofire/5.9.1'
_AUTH_TOKEN = None

def _fetch_auth_token(self):
if self._AUTH_TOKEN:
return self._AUTH_TOKEN
self._AUTH_TOKEN = traverse_obj(self._download_json(Request(
'https://rtpplayapi.rtp.pt/play/api/2/token-manager',
headers={
'Accept': '*/*',
'rtp-play-auth': 'RTPPLAY_MOBILE_IOS',
'rtp-play-auth-hash': 'fac9c328b2f27e26e03d7f8942d66c05b3e59371e16c2a079f5c83cc801bd3ee',
'rtp-play-auth-timestamp': '2145973229682',
'User-Agent': self._USER_AGENT,
}, extensions={'keep_header_casing': True}), None,
note='Fetching guest auth token', errnote='Could not fetch guest auth token',
fatal=False), ('token', 'token', {str}))
return self._AUTH_TOKEN

@staticmethod
def _cleanup_media_url(url):
if urllib.parse.urlparse(url).netloc == 'streaming-ondemand.rtp.pt':
return None
return url.replace('/drm-fps/', '/hls/').replace('/drm-dash/', '/dash/')

def _extract_formats(self, media_urls, episode_id):
formats = []
subtitles = {}
for media_url in set(traverse_obj(media_urls, (..., {url_or_none}, {self._cleanup_media_url}))):
ext = determine_ext(media_url)
if ext == 'm3u8':
fmts, subs = self._extract_m3u8_formats_and_subtitles(
media_url, episode_id, m3u8_id='hls', fatal=False)
formats.extend(fmts)
self._merge_subtitles(subs, target=subtitles)
elif ext == 'mpd':
fmts, subs = self._extract_mpd_formats_and_subtitles(
media_url, episode_id, mpd_id='dash', fatal=False)
formats.extend(fmts)
self._merge_subtitles(subs, target=subtitles)
else:
formats.append({
'url': media_url,
'format_id': 'http',
})
return formats, subtitles

def _extract_from_api(self, program_id, episode_id):
auth_token = self._fetch_auth_token()
if not auth_token:
return
episode_data = traverse_obj(self._download_json(
f'https://www.rtp.pt/play/api/1/get-episode/{program_id}/{episode_id[1:]}', episode_id,
query={'include_assets': 'true', 'include_webparams': 'true'},
headers={
'Accept': '*/*',
'Authorization': f'Bearer {auth_token}',
'User-Agent': self._USER_AGENT,
}, fatal=False), 'result', {dict})
if not episode_data:
return
asset_urls = traverse_obj(episode_data, ('assets', 0, 'asset_url', {dict}))
media_urls = traverse_obj(asset_urls, (
((('hls', 'dash'), 'stream_url'), ('multibitrate', ('url_hls', 'url_dash'))),))
formats, subtitles = self._extract_formats(media_urls, episode_id)

for sub_data in traverse_obj(asset_urls, ('subtitles', 'vtt_list', lambda _, v: url_or_none(v['file']))):
subtitles.setdefault(sub_data.get('code') or 'pt', []).append({
'url': sub_data['file'],
'name': sub_data.get('language'),
})

return {
'id': episode_id,
'formats': formats,
'subtitles': subtitles,
'thumbnail': traverse_obj(episode_data, ('assets', 0, 'asset_thumbnail', {url_or_none})),
**traverse_obj(episode_data, ('episode', {
'title': (('episode_title', 'program_title'), {str}, filter, any),
'alt_title': ('episode_subtitle', {str}, filter),
'description': (('episode_description', 'episode_summary'), {str}, filter, any),
'timestamp': ('episode_air_date', {parse_iso8601(delimiter=' ')}),
'modified_timestamp': ('episode_lastchanged', {parse_iso8601(delimiter=' ')}),
'duration': ('episode_duration_complete', {parse_duration}),
'episode': ('episode_title', {str}, filter),
'episode_number': ('episode_number', {int_or_none}),
'season': ('program_season', {str}, filter),
'series': ('program_title', {str}, filter),
})),
}

_RX_OBFUSCATION = re.compile(r'''(?xs)
atob\s*\(\s*decodeURIComponent\s*\(\s*
(\[[0-9A-Za-z%,'"]*\])
\s*\.\s*join\(\s*(?:""|'')\s*\)\s*\)\s*\)
''')

def __unobfuscate(self, data, *, video_id):
def __unobfuscate(self, data):
if data.startswith('{'):
return self._RX_OBFUSCATION.sub(
data = self._RX_OBFUSCATION.sub(
lambda m: json.dumps(
base64.b64decode(urllib.parse.unquote(
''.join(self._parse_json(m.group(1), video_id)),
''.join(json.loads(m.group(1))),
)).decode('iso-8859-1')),
data)
return js_to_json(data)

def _real_extract(self, url):
def _extract_from_html(self, url, episode_id):
video_id = self._match_id(url)
webpage = self._download_webpage(url, episode_id)

webpage = self._download_webpage(url, video_id)
title = self._html_search_meta(
'twitter:title', webpage, display_name='title', fatal=True)

f, config = self._search_regex(
r'''(?sx)
(?:var\s+f\s*=\s*(?P<f>".*?"|{[^;]+?});\s*)?
var\s+player1\s+=\s+new\s+RTPPlayer\s*\((?P<config>{(?:(?!\*/).)+?})\);(?!\s*\*/)
''', webpage,
'player config', group=('f', 'config'))

config = self._parse_json(
config, video_id,
lambda data: self.__unobfuscate(data, video_id=video_id))
f = config['file'] if not f else self._parse_json(
f, video_id,
lambda data: self.__unobfuscate(data, video_id=video_id))

formats = []
if isinstance(f, dict):
f_hls = f.get('hls')
if f_hls is not None:
formats.extend(self._extract_m3u8_formats(
f_hls, video_id, 'mp4', 'm3u8_native', m3u8_id='hls'))

f_dash = f.get('dash')
if f_dash is not None:
formats.extend(self._extract_mpd_formats(f_dash, video_id, mpd_id='dash'))
else:
formats.append({
'format_id': 'f',
'url': f,
'vcodec': 'none' if config.get('mediaType') == 'audio' else None,
})

subtitles = {}
media_urls = traverse_obj(re.findall(r'(?:var\s+f\s*=|RTPPlayer\({[^}]+file:)\s*({[^}]+}|"[^"]+")', webpage), (
vtt = config.get('vtt')
-1, (({self.__unobfuscate}, {js_to_json}, {json.loads}, {dict.values}, ...), {json.loads})))
if vtt is not None:
formats, subtitles = self._extract_formats(media_urls, episode_id)
for lcode, lname, url in vtt:
subtitles.setdefault(lcode, []).append({
'name': lname,
'url': url,
})

return {
'id': video_id,
'id': episode_id,
'title': title,
'formats': formats,
'description': self._html_search_meta(['description', 'twitter:description'], webpage),
'thumbnail': config.get('poster') or self._og_search_thumbnail(webpage),
'subtitles': subtitles,
'description': self._html_search_meta(['og:description', 'twitter:description'], webpage, default=None),
'thumbnail': self._html_search_meta(['og:image', 'twitter:image'], webpage, default=None),
**self._search_json_ld(webpage, episode_id, default={}),
'title': self._html_search_meta(['og:title', 'twitter:title'], webpage, default=None),
}

def _real_extract(self, url):
program_id, episode_id = self._match_valid_url(url).group('program_id', 'id')
return self._extract_from_api(program_id, episode_id) or self._extract_from_html(url, episode_id)
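For reference, the obfuscated RTP player config decodes as atob(decodeURIComponent(parts.join(''))): join the array, percent-decode, then base64-decode. A self-contained sketch of the Python equivalent used by __unobfuscate above (the input array is hypothetical):

import base64
import urllib.parse

parts = ['eyJ4Ijox', 'fQ%3D%3D']  # hypothetical array matched by _RX_OBFUSCATION
decoded = base64.b64decode(urllib.parse.unquote(''.join(parts))).decode('iso-8859-1')
assert decoded == '{"x":1}'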
87
yt_dlp/extractor/softwhiteunderbelly.py
Normal file
@ -0,0 +1,87 @@
from .common import InfoExtractor
from .vimeo import VHXEmbedIE
from ..utils import (
    ExtractorError,
    clean_html,
    update_url,
    urlencode_postdata,
)
from ..utils.traversal import find_element, traverse_obj


class SoftWhiteUnderbellyIE(InfoExtractor):
    _LOGIN_URL = 'https://www.softwhiteunderbelly.com/login'
    _NETRC_MACHINE = 'softwhiteunderbelly'
    _VALID_URL = r'https?://(?:www\.)?softwhiteunderbelly\.com/videos/(?P<id>[\w-]+)'
    _TESTS = [{
        'url': 'https://www.softwhiteunderbelly.com/videos/kenneth-final1',
        'note': 'A single Soft White Underbelly Episode',
        'md5': '8e79f29ec1f1bda6da2e0b998fcbebb8',
        'info_dict': {
            'id': '3201266',
            'ext': 'mp4',
            'display_id': 'kenneth-final1',
            'title': 'Appalachian Man interview-Kenneth',
            'description': 'Soft White Underbelly interview and portrait of Kenneth, an Appalachian man in Clay County, Kentucky.',
            'thumbnail': 'https://vhx.imgix.net/softwhiteunderbelly/assets/249f6db0-2b39-49a4-979b-f8dad4681825.jpg',
            'uploader_url': 'https://vimeo.com/user80538407',
            'uploader': 'OTT Videos',
            'uploader_id': 'user80538407',
            'duration': 512,
        },
        'expected_warnings': ['Failed to parse XML: not well-formed'],
    }, {
        'url': 'https://www.softwhiteunderbelly.com/videos/tj-2-final-2160p',
        'note': 'A single Soft White Underbelly Episode',
        'md5': '286bd8851b4824c62afb369e6f307036',
        'info_dict': {
            'id': '3506029',
            'ext': 'mp4',
            'display_id': 'tj-2-final-2160p',
            'title': 'Fentanyl Addict interview-TJ (follow up)',
            'description': 'Soft White Underbelly follow up interview and portrait of TJ, a fentanyl addict on Skid Row.',
            'thumbnail': 'https://vhx.imgix.net/softwhiteunderbelly/assets/c883d531-5da0-4faf-a2e2-8eba97e5adfc.jpg',
            'duration': 817,
            'uploader': 'OTT Videos',
            'uploader_url': 'https://vimeo.com/user80538407',
            'uploader_id': 'user80538407',
        },
        'expected_warnings': ['Failed to parse XML: not well-formed'],
    }]

    def _perform_login(self, username, password):
        signin_page = self._download_webpage(self._LOGIN_URL, None, 'Fetching authenticity token')
        self._download_webpage(
            self._LOGIN_URL, None, 'Logging in',
            data=urlencode_postdata({
                'email': username,
                'password': password,
                'authenticity_token': self._html_search_regex(
                    r'name=["\']authenticity_token["\']\s+value=["\']([^"\']+)', signin_page, 'authenticity_token'),
                'utf8': True,
            }),
        )

    def _real_extract(self, url):
        display_id = self._match_id(url)

        webpage = self._download_webpage(url, display_id)
        if '<div id="watch-unauthorized"' in webpage:
            if self._get_cookies('https://www.softwhiteunderbelly.com').get('_session'):
                raise ExtractorError('This account is not subscribed to this content', expected=True)
            self.raise_login_required()

        embed_url, embed_id = self._html_search_regex(
            r'embed_url:\s*["\'](?P<url>https?://embed\.vhx\.tv/videos/(?P<id>\d+)[^"\']*)',
            webpage, 'embed url', group=('url', 'id'))

        return {
            '_type': 'url_transparent',
            'ie_key': VHXEmbedIE.ie_key(),
            'url': VHXEmbedIE._smuggle_referrer(embed_url, 'https://www.softwhiteunderbelly.com'),
            'id': embed_id,
            'display_id': display_id,
            'title': traverse_obj(webpage, ({find_element(id='watch-info')}, {find_element(cls='video-title')}, {clean_html})),
            'description': self._html_search_meta('description', webpage, default=None),
            'thumbnail': update_url(self._og_search_thumbnail(webpage) or '', query=None) or None,
        }
@ -52,7 +52,8 @@ class SoundcloudBaseIE(InfoExtractor):
_API_VERIFY_AUTH_TOKEN = 'https://api-auth.soundcloud.com/connect/session%s'
_HEADERS = {}

_IMAGE_REPL_RE = r'-([0-9a-z]+)\.jpg'
_IMAGE_REPL_RE = r'-[0-9a-z]+\.(?P<ext>jpg|png)'
_TAGS_RE = re.compile(r'"([^"]+)"|([^ ]+)')

_ARTWORK_MAP = {
'mini': 16,
@ -331,12 +332,14 @@ def invalid_url(url):
thumbnails = []
artwork_url = info.get('artwork_url')
thumbnail = artwork_url or user.get('avatar_url')
if isinstance(thumbnail, str):
if url_or_none(thumbnail):
if re.search(self._IMAGE_REPL_RE, thumbnail):
if mobj := re.search(self._IMAGE_REPL_RE, thumbnail):
for image_id, size in self._ARTWORK_MAP.items():
# Soundcloud serves JPEG regardless of URL's ext *except* for "original" thumb
ext = mobj.group('ext') if image_id == 'original' else 'jpg'
i = {
'id': image_id,
'url': re.sub(self._IMAGE_REPL_RE, f'-{image_id}.jpg', thumbnail),
'url': re.sub(self._IMAGE_REPL_RE, f'-{image_id}.{ext}', thumbnail),
}
if image_id == 'tiny' and not artwork_url:
size = 18
@ -372,6 +375,7 @@ def extract_count(key):
'comment_count': extract_count('comment'),
'repost_count': extract_count('reposts'),
'genres': traverse_obj(info, ('genre', {str}, filter, all, filter)),
'tags': traverse_obj(info, ('tag_list', {self._TAGS_RE.findall}, ..., ..., filter)),
'artists': traverse_obj(info, ('publisher_metadata', 'artist', {str}, filter, all, filter)),
'formats': formats if not extract_flat else None,
}
@ -425,6 +429,7 @@ class SoundcloudIE(SoundcloudBaseIE):
'repost_count': int,
'thumbnail': 'https://i1.sndcdn.com/artworks-000031955188-rwb18x-original.jpg',
'uploader_url': 'https://soundcloud.com/ethmusic',
'tags': 'count:14',
},
},
# geo-restricted
@ -440,7 +445,7 @@ class SoundcloudIE(SoundcloudBaseIE):
'uploader_id': '9615865',
'timestamp': 1337635207,
'upload_date': '20120521',
'duration': 227.155,
'duration': 227.103,
'license': 'all-rights-reserved',
'view_count': int,
'like_count': int,
@ -450,6 +455,7 @@ class SoundcloudIE(SoundcloudBaseIE):
'thumbnail': 'https://i1.sndcdn.com/artworks-v8bFHhXm7Au6-0-original.jpg',
'genres': ['Alternative'],
'artists': ['The Royal Concept'],
'tags': [],
},
},
# private link
@ -475,6 +481,7 @@ class SoundcloudIE(SoundcloudBaseIE):
'uploader_url': 'https://soundcloud.com/jaimemf',
'thumbnail': 'https://a1.sndcdn.com/images/default_avatar_large.png',
'genres': ['youtubedl'],
'tags': [],
},
},
# private link (alt format)
@ -500,15 +507,16 @@ class SoundcloudIE(SoundcloudBaseIE):
'uploader_url': 'https://soundcloud.com/jaimemf',
'thumbnail': 'https://a1.sndcdn.com/images/default_avatar_large.png',
'genres': ['youtubedl'],
'tags': [],
},
},
# downloadable song
{
'url': 'https://soundcloud.com/the80m/the-following',
'md5': '9ffcddb08c87d74fb5808a3c183a1d04',
'md5': 'ecb87d7705d5f53e6c02a63760573c75', # wav: '9ffcddb08c87d74fb5808a3c183a1d04'
'info_dict': {
'id': '343609555',
'ext': 'wav',
'ext': 'opus', # wav original available with auth
'title': 'The Following',
'track': 'The Following',
'description': '',
@ -526,15 +534,18 @@ class SoundcloudIE(SoundcloudBaseIE):
'view_count': int,
'genres': ['Dance & EDM'],
'artists': ['80M'],
'tags': ['80M', 'EDM', 'Dance', 'Music'],
},
'expected_warnings': ['Original download format is only available for registered users'],
},
# private link, downloadable format
# tags with spaces (e.g. "Uplifting Trance", "Ori Uplift")
{
'url': 'https://soundcloud.com/oriuplift/uponly-238-no-talking-wav/s-AyZUd',
'md5': '64a60b16e617d41d0bef032b7f55441e',
'md5': '2e1530d0e9986a833a67cb34fc90ece0', # wav: '64a60b16e617d41d0bef032b7f55441e'
'info_dict': {
'id': '340344461',
'ext': 'wav',
'ext': 'opus', # wav original available with auth
'title': 'Uplifting Only 238 [No Talking] (incl. Alex Feed Guestmix) (Aug 31, 2017) [wav]',
'track': 'Uplifting Only 238 [No Talking] (incl. Alex Feed Guestmix) (Aug 31, 2017) [wav]',
'description': 'md5:fa20ee0fca76a3d6df8c7e57f3715366',
@ -552,7 +563,9 @@ class SoundcloudIE(SoundcloudBaseIE):
'uploader_url': 'https://soundcloud.com/oriuplift',
'genres': ['Trance'],
'artists': ['Ori Uplift'],
'tags': ['Orchestral', 'Emotional', 'Uplifting Trance', 'Trance', 'Ori Uplift', 'UpOnly'],
},
'expected_warnings': ['Original download format is only available for registered users'],
},
# no album art, use avatar pic for thumbnail
{
@ -577,6 +590,7 @@ class SoundcloudIE(SoundcloudBaseIE):
'repost_count': int,
'uploader_url': 'https://soundcloud.com/garyvee',
'artists': ['MadReal'],
'tags': [],
},
'params': {
'skip_download': True,
@ -604,8 +618,47 @@ class SoundcloudIE(SoundcloudBaseIE):
'repost_count': int,
'genres': ['Piano'],
'uploader_url': 'https://soundcloud.com/giovannisarani',
'tags': 'count:10',
},
},
# .png "original" artwork, 160kbps m4a HLS format
{
'url': 'https://soundcloud.com/skorxh/audio-dealer',
'info_dict': {
'id': '2011421339',
'ext': 'm4a',
'title': 'audio dealer',
'description': '',
'uploader': '$KORCH',
'uploader_id': '150292288',
'uploader_url': 'https://soundcloud.com/skorxh',
'comment_count': int,
'view_count': int,
'like_count': int,
'repost_count': int,
'duration': 213.469,
'tags': [],
'artists': ['$KORXH'],
'track': 'audio dealer',
'timestamp': 1737143201,
'upload_date': '20250117',
'license': 'all-rights-reserved',
'thumbnail': 'https://i1.sndcdn.com/artworks-a1wKGMYNreDLTMrT-fGjRiw-original.png',
'thumbnails': [
{'id': 'mini', 'url': 'https://i1.sndcdn.com/artworks-a1wKGMYNreDLTMrT-fGjRiw-mini.jpg'},
{'id': 'tiny', 'url': 'https://i1.sndcdn.com/artworks-a1wKGMYNreDLTMrT-fGjRiw-tiny.jpg'},
{'id': 'small', 'url': 'https://i1.sndcdn.com/artworks-a1wKGMYNreDLTMrT-fGjRiw-small.jpg'},
{'id': 'badge', 'url': 'https://i1.sndcdn.com/artworks-a1wKGMYNreDLTMrT-fGjRiw-badge.jpg'},
{'id': 't67x67', 'url': 'https://i1.sndcdn.com/artworks-a1wKGMYNreDLTMrT-fGjRiw-t67x67.jpg'},
{'id': 'large', 'url': 'https://i1.sndcdn.com/artworks-a1wKGMYNreDLTMrT-fGjRiw-large.jpg'},
{'id': 't300x300', 'url': 'https://i1.sndcdn.com/artworks-a1wKGMYNreDLTMrT-fGjRiw-t300x300.jpg'},
{'id': 'crop', 'url': 'https://i1.sndcdn.com/artworks-a1wKGMYNreDLTMrT-fGjRiw-crop.jpg'},
{'id': 't500x500', 'url': 'https://i1.sndcdn.com/artworks-a1wKGMYNreDLTMrT-fGjRiw-t500x500.jpg'},
{'id': 'original', 'url': 'https://i1.sndcdn.com/artworks-a1wKGMYNreDLTMrT-fGjRiw-original.png'},
],
},
'params': {'skip_download': 'm3u8', 'format': 'hls_aac_160k'},
},
{
# AAC HQ format available (account with active subscription needed)
'url': 'https://soundcloud.com/wandw/the-chainsmokers-ft-daya-dont-let-me-down-ww-remix-1',
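How the artwork variants above are produced: every size is rewritten to .jpg, except that an "original" .png stays .png. A standalone sketch using the new regex (the thumbnail URL is taken from the test above):

import re

_IMAGE_REPL_RE = r'-[0-9a-z]+\.(?P<ext>jpg|png)'
thumbnail = 'https://i1.sndcdn.com/artworks-a1wKGMYNreDLTMrT-fGjRiw-original.png'
mobj = re.search(_IMAGE_REPL_RE, thumbnail)
for image_id in ('mini', 't500x500', 'original'):
    # only the "original" variant keeps the source extension
    ext = mobj.group('ext') if image_id == 'original' else 'jpg'
    print(re.sub(_IMAGE_REPL_RE, f'-{image_id}.{ext}', thumbnail))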
@ -1,5 +1,6 @@
from .bunnycdn import BunnyCdnIE
from .common import InfoExtractor
from ..utils import try_get, unified_timestamp
from ..utils import make_archive_id, try_get, unified_timestamp


class SovietsClosetBaseIE(InfoExtractor):
@ -43,7 +44,7 @@ class SovietsClosetIE(SovietsClosetBaseIE):
'url': 'https://sovietscloset.com/video/1337',
'md5': 'bd012b04b261725510ca5383074cdd55',
'info_dict': {
'id': '1337',
'id': '2f0cfbf4-3588-43a9-a7d6-7c9ea3755e67',
'ext': 'mp4',
'title': 'The Witcher #13',
'thumbnail': r're:^https?://.*\.b-cdn\.net/2f0cfbf4-3588-43a9-a7d6-7c9ea3755e67/thumbnail\.jpg$',
@ -55,20 +56,23 @@ class SovietsClosetIE(SovietsClosetBaseIE):
'upload_date': '20170413',
'uploader_id': 'SovietWomble',
'uploader_url': 'https://www.twitch.tv/SovietWomble',
'duration': 7007,
'duration': 7008,
'was_live': True,
'availability': 'public',
'series': 'The Witcher',
'season': 'Misc',
'episode_number': 13,
'episode': 'Episode 13',
'creators': ['SovietWomble'],
'description': '',
'_old_archive_ids': ['sovietscloset 1337'],
},
},
{
'url': 'https://sovietscloset.com/video/1105',
'md5': '89fa928f183893cb65a0b7be846d8a90',
'info_dict': {
'id': '1105',
'id': 'c0e5e76f-3a93-40b4-bf01-12343c2eec5d',
'ext': 'mp4',
'title': 'Arma 3 - Zeus Games #5',
'uploader': 'SovietWomble',
@ -80,39 +84,20 @@ class SovietsClosetIE(SovietsClosetBaseIE):
'upload_date': '20160420',
'uploader_id': 'SovietWomble',
'uploader_url': 'https://www.twitch.tv/SovietWomble',
'duration': 8804,
'duration': 8805,
'was_live': True,
'availability': 'public',
'series': 'Arma 3',
'season': 'Zeus Games',
'episode_number': 5,
'episode': 'Episode 5',
'creators': ['SovietWomble'],
'description': '',
'_old_archive_ids': ['sovietscloset 1105'],
},
},
]

def _extract_bunnycdn_iframe(self, video_id, bunnycdn_id):
iframe = self._download_webpage(
f'https://iframe.mediadelivery.net/embed/5105/{bunnycdn_id}',
video_id, note='Downloading BunnyCDN iframe', headers=self.MEDIADELIVERY_REFERER)

m3u8_url = self._search_regex(r'(https?://.*?\.m3u8)', iframe, 'm3u8 url')
thumbnail_url = self._search_regex(r'(https?://.*?thumbnail\.jpg)', iframe, 'thumbnail url')

m3u8_formats = self._extract_m3u8_formats(m3u8_url, video_id, headers=self.MEDIADELIVERY_REFERER)

if not m3u8_formats:
duration = None
else:
duration = self._extract_m3u8_vod_duration(
m3u8_formats[0]['url'], video_id, headers=self.MEDIADELIVERY_REFERER)

return {
'formats': m3u8_formats,
'thumbnail': thumbnail_url,
'duration': duration,
}

def _real_extract(self, url):
video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id)
@ -122,13 +107,13 @@ def _real_extract(self, url):
stream = self.parse_nuxt_jsonp(f'{static_assets_base}/video/{video_id}/payload.js', video_id, 'video')['stream']

return {
return self.url_result(
f'https://iframe.mediadelivery.net/embed/5105/{stream["bunnyId"]}', ie=BunnyCdnIE, url_transparent=True,
**self.video_meta(
video_id=video_id, game_name=stream['game']['name'],
category_name=try_get(stream, lambda x: x['subcategory']['name'], str),
episode_number=stream.get('number'), stream_date=stream.get('date')),
**self._extract_bunnycdn_iframe(video_id, stream['bunnyId']),
_old_archive_ids=[make_archive_id(self, video_id)])
}


class SovietsClosetPlaylistIE(SovietsClosetBaseIE):
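Context for the make_archive_id call above: the video IDs switch from numeric site IDs to BunnyCDN GUIDs, so _old_archive_ids keeps existing download archives recognized. A quick sanity check of the helper's behavior (import paths as in the yt-dlp source tree):

from yt_dlp.extractor.sovietscloset import SovietsClosetIE
from yt_dlp.utils import make_archive_id

# matches the '_old_archive_ids' values in the tests above
assert make_archive_id(SovietsClosetIE, '1337') == 'sovietscloset 1337'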
@ -249,6 +249,12 @@ def _extract_web_data_and_status(self, url, video_id, fatal=True):
elif fatal:
raise ExtractorError('Unable to extract webpage video data')

if not traverse_obj(video_data, ('video', {dict})) and traverse_obj(video_data, ('isContentClassified', {bool})):
message = 'This post may not be comfortable for some audiences. Log in for access'
if fatal:
self.raise_login_required(message)
self.report_warning(f'{message}. {self._login_hint()}', video_id=video_id)

return video_data, status

def _get_subtitles(self, aweme_detail, aweme_id, user_name):
@ -895,8 +901,12 @@ def _real_extract(self, url):
if video_data and status == 0:
return self._parse_aweme_video_web(video_data, url, video_id)
elif status == 10216:
elif status in (10216, 10222):
raise ExtractorError('This video is private', expected=True)
# 10216: private post; 10222: private account
self.raise_login_required(
'You do not have permission to view this post. Log into an account that has access')
elif status == 10204:
raise ExtractorError('Your IP address is blocked from accessing this post', expected=True)
raise ExtractorError(f'Video not available, status code {status}', video_id=video_id)
117
yt_dlp/extractor/tvw.py
Normal file
117
yt_dlp/extractor/tvw.py
Normal file
@ -0,0 +1,117 @@
+import json
+
+from .common import InfoExtractor
+from ..utils import clean_html, remove_end, unified_timestamp, url_or_none
+from ..utils.traversal import traverse_obj
+
+
+class TvwIE(InfoExtractor):
+    _VALID_URL = r'https?://(?:www\.)?tvw\.org/video/(?P<id>[^/?#]+)'
+
+    _TESTS = [{
+        'url': 'https://tvw.org/video/billy-frank-jr-statue-maquette-unveiling-ceremony-2024011211/',
+        'md5': '9ceb94fe2bb7fd726f74f16356825703',
+        'info_dict': {
+            'id': '2024011211',
+            'ext': 'mp4',
+            'title': 'Billy Frank Jr. Statue Maquette Unveiling Ceremony',
+            'thumbnail': r're:^https?://.*\.(?:jpe?g|png)$',
+            'description': 'md5:58a8150017d985b4f377e11ee8f6f36e',
+            'timestamp': 1704902400,
+            'upload_date': '20240110',
+            'location': 'Legislative Building',
+            'display_id': 'billy-frank-jr-statue-maquette-unveiling-ceremony-2024011211',
+            'categories': ['General Interest'],
+        },
+    }, {
+        'url': 'https://tvw.org/video/ebeys-landing-state-park-2024081007/',
+        'md5': '71e87dae3deafd65d75ff3137b9a32fc',
+        'info_dict': {
+            'id': '2024081007',
+            'ext': 'mp4',
+            'title': 'Ebey\'s Landing State Park',
+            'thumbnail': r're:^https?://.*\.(?:jpe?g|png)$',
+            'description': 'md5:50c5bd73bde32fa6286a008dbc853386',
+            'timestamp': 1724310900,
+            'upload_date': '20240822',
+            'location': 'Ebey’s Landing State Park',
+            'display_id': 'ebeys-landing-state-park-2024081007',
+            'categories': ['Washington State Parks'],
+        },
+    }, {
+        'url': 'https://tvw.org/video/home-warranties-workgroup-2',
+        'md5': 'f678789bf94d07da89809f213cf37150',
+        'info_dict': {
+            'id': '1999121000',
+            'ext': 'mp4',
+            'title': 'Home Warranties Workgroup',
+            'thumbnail': r're:^https?://.*\.(?:jpe?g|png)$',
+            'description': 'md5:861396cc523c9641d0dce690bc5c35f3',
+            'timestamp': 946389600,
+            'upload_date': '19991228',
+            'display_id': 'home-warranties-workgroup-2',
+            'categories': ['Legislative'],
+        },
+    }, {
+        'url': 'https://tvw.org/video/washington-to-washington-a-new-space-race-2022041111/?eventID=2022041111',
+        'md5': '6f5551090b351aba10c0d08a881b4f30',
+        'info_dict': {
+            'id': '2022041111',
+            'ext': 'mp4',
+            'title': 'Washington to Washington - A New Space Race',
+            'thumbnail': r're:^https?://.*\.(?:jpe?g|png)$',
+            'description': 'md5:f65a24eec56107afbcebb3aa5cd26341',
+            'timestamp': 1650394800,
+            'upload_date': '20220419',
+            'location': 'Hayner Media Center',
+            'display_id': 'washington-to-washington-a-new-space-race-2022041111',
+            'categories': ['Washington to Washington', 'General Interest'],
+        },
+    }]
+
+    def _real_extract(self, url):
+        display_id = self._match_id(url)
+        webpage = self._download_webpage(url, display_id)
+
+        client_id = self._html_search_meta('clientID', webpage, fatal=True)
+        video_id = self._html_search_meta('eventID', webpage, fatal=True)
+
+        video_data = self._download_json(
+            'https://api.v3.invintus.com/v2/Event/getDetailed', video_id,
+            headers={
+                'authorization': 'embedder',
+                'wsc-api-key': '7WhiEBzijpritypp8bqcU7pfU9uicDR',
+            },
+            data=json.dumps({
+                'clientID': client_id,
+                'eventID': video_id,
+                'showStreams': True,
+            }).encode())['data']
+
+        formats = []
+        subtitles = {}
+        for stream_url in traverse_obj(video_data, ('streamingURIs', ..., {url_or_none})):
+            fmts, subs = self._extract_m3u8_formats_and_subtitles(
+                stream_url, video_id, 'mp4', m3u8_id='hls', fatal=False)
+            formats.extend(fmts)
+            self._merge_subtitles(subs, target=subtitles)
+        if caption_url := traverse_obj(video_data, ('captionPath', {url_or_none})):
+            subtitles.setdefault('en', []).append({'url': caption_url, 'ext': 'vtt'})
+
+        return {
+            'id': video_id,
+            'display_id': display_id,
+            'formats': formats,
+            'subtitles': subtitles,
+            'title': remove_end(self._og_search_title(webpage, default=None), ' - TVW'),
+            'description': self._og_search_description(webpage, default=None),
+            **traverse_obj(video_data, {
+                'title': ('title', {str}),
+                'description': ('description', {clean_html}),
+                'categories': ('categories', ..., {str}),
+                'thumbnail': ('videoThumbnail', {url_or_none}),
+                'timestamp': ('startDateTime', {unified_timestamp}),
+                'location': ('locationName', {str}),
+                'is_live': ('eventStatus', {lambda x: x == 'live'}),
+            }),
+        }
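For context, the extractor above posts a small JSON body to the Invintus events API; a minimal standalone sketch of that call follows. The clientID value below is a hypothetical placeholder, since the real one is scraped from the page's clientID meta tag.

    import json
    import urllib.request

    # Hypothetical clientID; the extractor reads the real value from the webpage
    payload = json.dumps({'clientID': '0000000000', 'eventID': '2024011211', 'showStreams': True}).encode()
    req = urllib.request.Request(
        'https://api.v3.invintus.com/v2/Event/getDetailed', data=payload,
        headers={
            'authorization': 'embedder',
            'wsc-api-key': '7WhiEBzijpritypp8bqcU7pfU9uicDR',
            'Content-Type': 'application/json',
        })
    with urllib.request.urlopen(req) as resp:
        data = json.load(resp)['data']
    print(data.get('streamingURIs'))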
@@ -1,11 +1,12 @@
 import functools
 import json
-import random
+import math
 import re
 import urllib.parse
 
 from .common import InfoExtractor
 from .periscope import PeriscopeBaseIE, PeriscopeIE
+from ..jsinterp import js_number_to_string
 from ..networking.exceptions import HTTPError
 from ..utils import (
     ExtractorError,
@@ -1330,6 +1331,11 @@ def _build_graphql_query(self, media_id):
             },
         }
 
+    def _generate_syndication_token(self, twid):
+        # ((Number(twid) / 1e15) * Math.PI).toString(36).replace(/(0+|\.)/g, '')
+        translation = str.maketrans(dict.fromkeys('0.'))
+        return js_number_to_string((int(twid) / 1e15) * math.pi, 36).translate(translation)
+
     def _call_syndication_api(self, twid):
         self.report_warning(
             'Not all metadata or media is available via syndication endpoint', twid, only_once=True)
@@ -1337,8 +1343,7 @@ def _call_syndication_api(self, twid):
             'https://cdn.syndication.twimg.com/tweet-result', twid, 'Downloading syndication JSON',
             headers={'User-Agent': 'Googlebot'}, query={
                 'id': twid,
-                # TODO: token = ((Number(twid) / 1e15) * Math.PI).toString(36).replace(/(0+|\.)/g, '')
-                'token': ''.join(random.choices('123456789abcdefghijklmnopqrstuvwxyz', k=10)),
+                'token': self._generate_syndication_token(twid),
             })
         if not status:
             raise ExtractorError('Syndication endpoint returned empty JSON response')
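The change replaces the random token with a faithful port of the JS expression quoted in the comment. As a rough standalone illustration of what `js_number_to_string` contributes here, the float-to-base-36 conversion can be approximated in plain Python. This is a sketch only; JS `Number.prototype.toString(36)` chooses its own fractional precision, so the exact digit count may differ.

    import math

    DIGITS = '0123456789abcdefghijklmnopqrstuvwxyz'

    def approx_js_to_string_36(value, frac_digits=12):
        # Integer part via repeated division by 36
        int_part, frac = divmod(value, 1)
        n, out = int(int_part), ''
        while n:
            n, r = divmod(n, 36)
            out = DIGITS[r] + out
        out = (out or '0') + '.'
        # Fractional part, one base-36 digit at a time (precision is a guess)
        for _ in range(frac_digits):
            frac *= 36
            digit, frac = divmod(frac, 1)
            out += DIGITS[int(digit)]
        return out

    twid = '4910815147462302'
    token = approx_js_to_string_36((int(twid) / 1e15) * math.pi)
    print(token.replace('0', '').replace('.', ''))  # same effect as the /(0+|\.)/g regex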
@@ -116,6 +116,7 @@ class VKIE(VKBaseIE):
             'id': '-77521_162222515',
             'ext': 'mp4',
             'title': 'ProtivoGunz - Хуёвая песня',
+            'description': 'Видео из официальной группы Noize MC\nhttp://vk.com/noizemc',
             'uploader': 're:(?:Noize MC|Alexander Ilyashenko).*',
             'uploader_id': '39545378',
             'duration': 195,
@@ -165,6 +166,7 @@ class VKIE(VKBaseIE):
             'id': '-93049196_456239755',
             'ext': 'mp4',
             'title': '8 серия (озвучка)',
+            'description': 'Видео из официальной группы Noize MC\nhttp://vk.com/noizemc',
             'duration': 8383,
             'comment_count': int,
             'uploader': 'Dizi2021',
@@ -240,6 +242,7 @@ class VKIE(VKBaseIE):
             'upload_date': '20221005',
             'uploader': 'Шальная Императрица',
             'uploader_id': '-74006511',
+            'description': 'md5:f9315f7786fa0e84e75e4f824a48b056',
         },
     },
     {
@@ -278,6 +281,25 @@ class VKIE(VKBaseIE):
         },
         'skip': 'No formats found',
     },
+    {
+        'note': 'video has chapters',
+        'url': 'https://vkvideo.ru/video-18403220_456239696',
+        'info_dict': {
+            'id': '-18403220_456239696',
+            'ext': 'mp4',
+            'title': 'Трамп отменяет гранты // DeepSeek - Революция в ИИ // Илон Маск читер',
+            'description': 'md5:b112ea9de53683b6d03d29076f62eec2',
+            'uploader': 'Руслан Усачев',
+            'uploader_id': '-18403220',
+            'comment_count': int,
+            'like_count': int,
+            'duration': 1983,
+            'thumbnail': r're:https?://.+\.jpg',
+            'chapters': 'count:21',
+            'timestamp': 1738252883,
+            'upload_date': '20250130',
+        },
+    },
     {
         # live stream, hls and rtmp links, most likely already finished live
         # stream by the time you are reading this comment
@@ -449,7 +471,6 @@ def _real_extract(self, url):
             return self.url_result(opts_url)
 
         data = player['params'][0]
-        title = unescapeHTML(data['md_title'])
 
         # 2 = live
         # 3 = post live (finished live)
@@ -507,17 +528,29 @@ def _real_extract(self, url):
         return {
             'id': video_id,
             'formats': formats,
-            'title': title,
-            'thumbnail': data.get('jpg'),
-            'uploader': data.get('md_author'),
-            'uploader_id': str_or_none(data.get('author_id') or mv_data.get('authorId')),
-            'duration': int_or_none(data.get('duration') or mv_data.get('duration')),
+            'subtitles': subtitles,
+            **traverse_obj(mv_data, {
+                'title': ('title', {unescapeHTML}),
+                'description': ('desc', {clean_html}, filter),
+                'duration': ('duration', {int_or_none}),
+                'like_count': ('likes', {int_or_none}),
+                'comment_count': ('commcount', {int_or_none}),
+            }),
+            **traverse_obj(data, {
+                'title': ('md_title', {unescapeHTML}),
+                'description': ('description', {clean_html}, filter),
+                'thumbnail': ('jpg', {url_or_none}),
+                'uploader': ('md_author', {str}),
+                'uploader_id': (('author_id', 'authorId'), {str_or_none}, any),
+                'duration': ('duration', {int_or_none}),
+                'chapters': ('time_codes', lambda _, v: isinstance(v['time'], int), {
+                    'title': ('text', {str}),
+                    'start_time': 'time',
+                }),
+            }),
             'timestamp': timestamp,
             'view_count': view_count,
-            'like_count': int_or_none(mv_data.get('likes')),
-            'comment_count': int_or_none(mv_data.get('commcount')),
             'is_live': is_live,
-            'subtitles': subtitles,
             '_format_sort_fields': ('res', 'source'),
         }
 
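The new `chapters` mapping keeps only `time_codes` entries whose offset is an integer. A small sketch of the plain-Python equivalent of what that traversal expresses, using invented sample data:

    data = {'time_codes': [
        {'time': 0, 'text': 'Интро'},
        {'time': 95, 'text': 'DeepSeek'},
        {'time': 'bad', 'text': 'dropped'},  # non-integer offsets are filtered out
    ]}
    chapters = [
        {'title': marker.get('text'), 'start_time': marker['time']}
        for marker in data['time_codes']
        if isinstance(marker['time'], int)
    ]
    print(chapters)  # [{'title': 'Интро', 'start_time': 0}, {'title': 'DeepSeek', 'start_time': 95}]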
@@ -12,6 +12,7 @@
     str_or_none,
     strip_jsonp,
     traverse_obj,
+    truncate_string,
     url_or_none,
     urlencode_postdata,
     urljoin,
@@ -96,7 +97,8 @@ def _extract_formats(self, video_info):
         })
         return formats
 
-    def _parse_video_info(self, video_info, video_id=None):
+    def _parse_video_info(self, video_info):
+        video_id = traverse_obj(video_info, (('id', 'id_str', 'mid'), {str_or_none}, any))
         return {
             'id': video_id,
             'extractor_key': WeiboIE.ie_key(),
@@ -105,9 +107,10 @@ def _parse_video_info(self, video_info, video_id=None):
             'http_headers': {'Referer': 'https://weibo.com/'},
             '_old_archive_ids': [make_archive_id('WeiboMobile', video_id)],
             **traverse_obj(video_info, {
-                'id': (('id', 'id_str', 'mid'), {str_or_none}),
                 'display_id': ('mblogid', {str_or_none}),
-                'title': ('page_info', 'media_info', ('video_title', 'kol_title', 'name'), {str}, filter),
+                'title': ('page_info', 'media_info', ('video_title', 'kol_title', 'name'),
+                          {lambda x: x.replace('\n', ' ')}, {truncate_string(left=50)}, filter),
+                'alt_title': ('page_info', 'media_info', ('video_title', 'kol_title', 'name'), {str}, filter),
                 'description': ('text_raw', {str}),
                 'duration': ('page_info', 'media_info', 'duration', {int_or_none}),
                 'timestamp': ('page_info', 'media_info', 'video_publish_time', {int_or_none}),
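The intent of the new `title` mapping, sketched without the yt-dlp helpers: newlines are flattened to spaces and long captions are shortened, with the untouched string kept as `alt_title`. The exact ellipsis behaviour of `truncate_string(left=50)` may differ in detail from this illustration.

    raw = 'first line of a very long Weibo caption\nsecond line with more text that keeps going'
    flattened = raw.replace('\n', ' ')
    title = flattened if len(flattened) <= 50 else flattened[:49].rstrip() + '…'
    print(title)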
@@ -129,9 +132,11 @@ class WeiboIE(WeiboBaseIE):
         'url': 'https://weibo.com/7827771738/N4xlMvjhI',
         'info_dict': {
             'id': '4910815147462302',
+            '_old_archive_ids': ['weibomobile 4910815147462302'],
             'ext': 'mp4',
             'display_id': 'N4xlMvjhI',
             'title': '【睡前消息暑假版第一期:拉泰国一把 对中国有好处】',
+            'alt_title': '【睡前消息暑假版第一期:拉泰国一把 对中国有好处】',
             'description': 'md5:e2637a7673980d68694ea7c43cf12a5f',
             'duration': 918,
             'timestamp': 1686312819,
@@ -149,9 +154,11 @@ class WeiboIE(WeiboBaseIE):
         'url': 'https://m.weibo.cn/status/4189191225395228',
         'info_dict': {
             'id': '4189191225395228',
+            '_old_archive_ids': ['weibomobile 4189191225395228'],
             'ext': 'mp4',
             'display_id': 'FBqgOmDxO',
             'title': '柴犬柴犬的秒拍视频',
+            'alt_title': '柴犬柴犬的秒拍视频',
             'description': 'md5:80f461ab5cdae6bbdb70efbf5a1db24f',
             'duration': 53,
             'timestamp': 1514264429,
@@ -166,34 +173,35 @@ class WeiboIE(WeiboBaseIE):
         },
     }, {
         'url': 'https://m.weibo.cn/detail/4189191225395228',
-        'info_dict': {
-            'id': '4189191225395228',
-            'ext': 'mp4',
-            'display_id': 'FBqgOmDxO',
-            'title': '柴犬柴犬的秒拍视频',
-            'description': '午睡当然是要甜甜蜜蜜的啦![坏笑] Instagram:shibainu.gaku http://t.cn/RHbmjzW ',
-            'duration': 53,
-            'timestamp': 1514264429,
-            'upload_date': '20171226',
-            'thumbnail': r're:https://.*\.jpg',
-            'uploader': '柴犬柴犬',
-            'uploader_id': '5926682210',
-            'uploader_url': 'https://weibo.com/u/5926682210',
-            'view_count': int,
-            'like_count': int,
-            'repost_count': int,
-        },
+        'only_matching': True,
     }, {
         'url': 'https://weibo.com/0/4224132150961381',
         'note': 'no playback_list example',
         'only_matching': True,
+    }, {
+        'url': 'https://m.weibo.cn/detail/5120561132606436',
+        'info_dict': {
+            'id': '5120561132606436',
+        },
+        'playlist_count': 9,
     }]
 
     def _real_extract(self, url):
         video_id = self._match_id(url)
 
-        return self._parse_video_info(self._weibo_download_json(
-            f'https://weibo.com/ajax/statuses/show?id={video_id}', video_id))
+        meta = self._weibo_download_json(f'https://weibo.com/ajax/statuses/show?id={video_id}', video_id)
+        mix_media_info = traverse_obj(meta, ('mix_media_info', 'items', ...))
+        if not mix_media_info:
+            return self._parse_video_info(meta)
+
+        return self.playlist_result(self._entries(mix_media_info), video_id)
+
+    def _entries(self, mix_media_info):
+        for media_info in traverse_obj(mix_media_info, lambda _, v: v['type'] != 'pic'):
+            yield self._parse_video_info(traverse_obj(media_info, {
+                'id': ('data', 'object_id'),
+                'page_info': {'media_info': ('data', 'media_info', {dict})},
+            }))
 
 
 class WeiboVideoIE(WeiboBaseIE):
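A toy illustration of the new playlist path: only non-picture items of `mix_media_info` become entries, each re-shaped so `_parse_video_info` sees the same keys as a plain status. The sample data here is invented.

    mix_media_info = [
        {'type': 'pic', 'data': {'object_id': 'ignored'}},
        {'type': 'video', 'data': {'object_id': '5120561132606437',
                                   'media_info': {'duration': 53}}},
    ]
    entries = [
        {'id': item['data']['object_id'],
         'page_info': {'media_info': item['data']['media_info']}}
        for item in mix_media_info if item['type'] != 'pic'
    ]
    print(entries)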
@ -100,8 +100,8 @@ def _real_extract(self, url):
|
|||||||
|
|
||||||
|
|
||||||
class WSJArticleIE(InfoExtractor):
|
class WSJArticleIE(InfoExtractor):
|
||||||
_VALID_URL = r'(?i)https?://(?:www\.)?wsj\.com/articles/(?P<id>[^/?#&]+)'
|
_VALID_URL = r'(?i)https?://(?:www\.)?wsj\.com/(?:articles|opinion)/(?P<id>[^/?#&]+)'
|
||||||
_TEST = {
|
_TESTS = [{
|
||||||
'url': 'https://www.wsj.com/articles/dont-like-china-no-pandas-for-you-1490366939?',
|
'url': 'https://www.wsj.com/articles/dont-like-china-no-pandas-for-you-1490366939?',
|
||||||
'info_dict': {
|
'info_dict': {
|
||||||
'id': '4B13FA62-1D8C-45DB-8EA1-4105CB20B362',
|
'id': '4B13FA62-1D8C-45DB-8EA1-4105CB20B362',
|
||||||
@ -110,11 +110,20 @@ class WSJArticleIE(InfoExtractor):
|
|||||||
'uploader_id': 'ralcaraz',
|
'uploader_id': 'ralcaraz',
|
||||||
'title': 'Bao Bao the Panda Leaves for China',
|
'title': 'Bao Bao the Panda Leaves for China',
|
||||||
},
|
},
|
||||||
}
|
}, {
|
||||||
|
'url': 'https://www.wsj.com/opinion/hamas-hostages-caskets-bibas-family-israel-gaza-29da083b',
|
||||||
|
'info_dict': {
|
||||||
|
'id': 'CE68D629-8DB8-4CD3-B30A-92112C102054',
|
||||||
|
'ext': 'mp4',
|
||||||
|
'upload_date': '20241007',
|
||||||
|
'uploader_id': 'Tinnes, David',
|
||||||
|
'title': 'WSJ Opinion: "Get the Jew": The Crown Heights Riot Revisited',
|
||||||
|
},
|
||||||
|
}]
|
||||||
|
|
||||||
def _real_extract(self, url):
|
def _real_extract(self, url):
|
||||||
article_id = self._match_id(url)
|
article_id = self._match_id(url)
|
||||||
webpage = self._download_webpage(url, article_id)
|
webpage = self._download_webpage(url, article_id, impersonate=True)
|
||||||
video_id = self._search_regex(
|
video_id = self._search_regex(
|
||||||
r'(?:id=["\']video|video-|iframe\.html\?guid=|data-src=["\'])([a-fA-F0-9-]{36})',
|
r'(?:id=["\']video|video-|iframe\.html\?guid=|data-src=["\'])([a-fA-F0-9-]{36})',
|
||||||
webpage, 'video id')
|
webpage, 'video id')
|
||||||
50 yt_dlp/extractor/youtube/__init__.py (new file)
@@ -0,0 +1,50 @@
+# flake8: noqa: F401
+from ._base import YoutubeBaseInfoExtractor
+from ._clip import YoutubeClipIE
+from ._mistakes import YoutubeTruncatedIDIE, YoutubeTruncatedURLIE
+from ._notifications import YoutubeNotificationsIE
+from ._redirect import (
+    YoutubeConsentRedirectIE,
+    YoutubeFavouritesIE,
+    YoutubeFeedsInfoExtractor,
+    YoutubeHistoryIE,
+    YoutubeLivestreamEmbedIE,
+    YoutubeRecommendedIE,
+    YoutubeShortsAudioPivotIE,
+    YoutubeSubscriptionsIE,
+    YoutubeWatchLaterIE,
+    YoutubeYtBeIE,
+    YoutubeYtUserIE,
+)
+from ._search import YoutubeMusicSearchURLIE, YoutubeSearchDateIE, YoutubeSearchIE, YoutubeSearchURLIE
+from ._tab import YoutubePlaylistIE, YoutubeTabBaseInfoExtractor, YoutubeTabIE
+from ._video import YoutubeIE
+
+# Hack to allow plugin overrides to work
+for _cls in [
+    YoutubeBaseInfoExtractor,
+    YoutubeClipIE,
+    YoutubeTruncatedIDIE,
+    YoutubeTruncatedURLIE,
+    YoutubeNotificationsIE,
+    YoutubeConsentRedirectIE,
+    YoutubeFavouritesIE,
+    YoutubeFeedsInfoExtractor,
+    YoutubeHistoryIE,
+    YoutubeLivestreamEmbedIE,
+    YoutubeRecommendedIE,
+    YoutubeShortsAudioPivotIE,
+    YoutubeSubscriptionsIE,
+    YoutubeWatchLaterIE,
+    YoutubeYtBeIE,
+    YoutubeYtUserIE,
+    YoutubeMusicSearchURLIE,
+    YoutubeSearchDateIE,
+    YoutubeSearchIE,
+    YoutubeSearchURLIE,
+    YoutubePlaylistIE,
+    YoutubeTabBaseInfoExtractor,
+    YoutubeTabIE,
+    YoutubeIE,
+]:
+    _cls.__module__ = 'yt_dlp.extractor.youtube'
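A toy demonstration, outside yt-dlp, of what the loop above changes: `__module__` is an ordinary writable class attribute, so the re-exported classes can claim the package path that plugin overrides expect even though they are defined in submodules.

    class Probe:
        pass

    print(Probe.__module__)  # '__main__' when run as a script
    Probe.__module__ = 'yt_dlp.extractor.youtube'
    print(Probe.__module__)  # 'yt_dlp.extractor.youtube'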
1145 yt_dlp/extractor/youtube/_base.py (new file; diff suppressed because it is too large)
66 yt_dlp/extractor/youtube/_clip.py (new file)
@@ -0,0 +1,66 @@
+from ._tab import YoutubeTabBaseInfoExtractor
+from ._video import YoutubeIE
+from ...utils import ExtractorError, traverse_obj
+
+
+class YoutubeClipIE(YoutubeTabBaseInfoExtractor):
+    IE_NAME = 'youtube:clip'
+    _VALID_URL = r'https?://(?:www\.)?youtube\.com/clip/(?P<id>[^/?#]+)'
+    _TESTS = [{
+        # FIXME: Other metadata should be extracted from the clip, not from the base video
+        'url': 'https://www.youtube.com/clip/UgytZKpehg-hEMBSn3F4AaABCQ',
+        'info_dict': {
+            'id': 'UgytZKpehg-hEMBSn3F4AaABCQ',
+            'ext': 'mp4',
+            'section_start': 29.0,
+            'section_end': 39.7,
+            'duration': 10.7,
+            'age_limit': 0,
+            'availability': 'public',
+            'categories': ['Gaming'],
+            'channel': 'Scott The Woz',
+            'channel_id': 'UC4rqhyiTs7XyuODcECvuiiQ',
+            'channel_url': 'https://www.youtube.com/channel/UC4rqhyiTs7XyuODcECvuiiQ',
+            'description': 'md5:7a4517a17ea9b4bd98996399d8bb36e7',
+            'like_count': int,
+            'playable_in_embed': True,
+            'tags': 'count:17',
+            'thumbnail': 'https://i.ytimg.com/vi_webp/ScPX26pdQik/maxresdefault.webp',
+            'title': 'Mobile Games on Console - Scott The Woz',
+            'upload_date': '20210920',
+            'uploader': 'Scott The Woz',
+            'uploader_id': '@ScottTheWoz',
+            'uploader_url': 'https://www.youtube.com/@ScottTheWoz',
+            'view_count': int,
+            'live_status': 'not_live',
+            'channel_follower_count': int,
+            'chapters': 'count:20',
+            'comment_count': int,
+            'heatmap': 'count:100',
+        },
+    }]
+
+    def _real_extract(self, url):
+        clip_id = self._match_id(url)
+        _, data = self._extract_webpage(url, clip_id)
+
+        video_id = traverse_obj(data, ('currentVideoEndpoint', 'watchEndpoint', 'videoId'))
+        if not video_id:
+            raise ExtractorError('Unable to find video ID')
+
+        clip_data = traverse_obj(data, (
+            'engagementPanels', ..., 'engagementPanelSectionListRenderer', 'content', 'clipSectionRenderer',
+            'contents', ..., 'clipAttributionRenderer', 'onScrubExit', 'commandExecutorCommand', 'commands', ...,
+            'openPopupAction', 'popup', 'notificationActionRenderer', 'actionButton', 'buttonRenderer', 'command',
+            'commandExecutorCommand', 'commands', ..., 'loopCommand'), get_all=False)
+
+        return {
+            '_type': 'url_transparent',
+            'url': f'https://www.youtube.com/watch?v={video_id}',
+            'ie_key': YoutubeIE.ie_key(),
+            'id': clip_id,
+            'section_start': int(clip_data['startTimeMs']) / 1000,
+            'section_end': int(clip_data['endTimeMs']) / 1000,
+            '_format_sort_fields': (  # https protocol is prioritized for ffmpeg compatibility
+                'proto:https', 'quality', 'res', 'fps', 'hdr:12', 'source', 'vcodec', 'channels', 'acodec', 'lang'),
+        }
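The section bounds in the return value are plain millisecond-to-second conversions of the scraped clip attributes; with the values from the test case above:

    clip_data = {'startTimeMs': '29000', 'endTimeMs': '39700'}
    section_start = int(clip_data['startTimeMs']) / 1000
    section_end = int(clip_data['endTimeMs']) / 1000
    print(section_start, section_end, round(section_end - section_start, 1))  # 29.0 39.7 10.7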
69 yt_dlp/extractor/youtube/_mistakes.py (new file)
@@ -0,0 +1,69 @@
+
+from ._base import YoutubeBaseInfoExtractor
+from ...utils import ExtractorError
+
+
+class YoutubeTruncatedURLIE(YoutubeBaseInfoExtractor):
+    IE_NAME = 'youtube:truncated_url'
+    IE_DESC = False  # Do not list
+    _VALID_URL = r'''(?x)
+        (?:https?://)?
+        (?:\w+\.)?[yY][oO][uU][tT][uU][bB][eE](?:-nocookie)?\.com/
+        (?:watch\?(?:
+            feature=[a-z_]+|
+            annotation_id=annotation_[^&]+|
+            x-yt-cl=[0-9]+|
+            hl=[^&]*|
+            t=[0-9]+
+        )?
+        |
+            attribution_link\?a=[^&]+
+        )
+        $
+    '''
+
+    _TESTS = [{
+        'url': 'https://www.youtube.com/watch?annotation_id=annotation_3951667041',
+        'only_matching': True,
+    }, {
+        'url': 'https://www.youtube.com/watch?',
+        'only_matching': True,
+    }, {
+        'url': 'https://www.youtube.com/watch?x-yt-cl=84503534',
+        'only_matching': True,
+    }, {
+        'url': 'https://www.youtube.com/watch?feature=foo',
+        'only_matching': True,
+    }, {
+        'url': 'https://www.youtube.com/watch?hl=en-GB',
+        'only_matching': True,
+    }, {
+        'url': 'https://www.youtube.com/watch?t=2372',
+        'only_matching': True,
+    }]
+
+    def _real_extract(self, url):
+        raise ExtractorError(
+            'Did you forget to quote the URL? Remember that & is a meta '
+            'character in most shells, so you want to put the URL in quotes, '
+            'like yt-dlp '
+            '"https://www.youtube.com/watch?feature=foo&v=BaW_jenozKc" '
+            ' or simply yt-dlp BaW_jenozKc .',
+            expected=True)
+
+
+class YoutubeTruncatedIDIE(YoutubeBaseInfoExtractor):
+    IE_NAME = 'youtube:truncated_id'
+    IE_DESC = False  # Do not list
+    _VALID_URL = r'https?://(?:www\.)?youtube\.com/watch\?v=(?P<id>[0-9A-Za-z_-]{1,10})$'
+
+    _TESTS = [{
+        'url': 'https://www.youtube.com/watch?v=N_708QY7Ob',
+        'only_matching': True,
+    }]
+
+    def _real_extract(self, url):
+        video_id = self._match_id(url)
+        raise ExtractorError(
+            f'Incomplete YouTube ID {video_id}. URL {url} looks truncated.',
+            expected=True)
98 yt_dlp/extractor/youtube/_notifications.py (new file)
@@ -0,0 +1,98 @@
+import itertools
+import re
+
+from ._tab import YoutubeTabBaseInfoExtractor, YoutubeTabIE
+from ._video import YoutubeIE
+from ...utils import traverse_obj
+
+
+class YoutubeNotificationsIE(YoutubeTabBaseInfoExtractor):
+    IE_NAME = 'youtube:notif'
+    IE_DESC = 'YouTube notifications; ":ytnotif" keyword (requires cookies)'
+    _VALID_URL = r':ytnotif(?:ication)?s?'
+    _LOGIN_REQUIRED = True
+    _TESTS = [{
+        'url': ':ytnotif',
+        'only_matching': True,
+    }, {
+        'url': ':ytnotifications',
+        'only_matching': True,
+    }]
+
+    def _extract_notification_menu(self, response, continuation_list):
+        notification_list = traverse_obj(
+            response,
+            ('actions', 0, 'openPopupAction', 'popup', 'multiPageMenuRenderer', 'sections', 0, 'multiPageMenuNotificationSectionRenderer', 'items'),
+            ('actions', 0, 'appendContinuationItemsAction', 'continuationItems'),
+            expected_type=list) or []
+        continuation_list[0] = None
+        for item in notification_list:
+            entry = self._extract_notification_renderer(item.get('notificationRenderer'))
+            if entry:
+                yield entry
+            continuation = item.get('continuationItemRenderer')
+            if continuation:
+                continuation_list[0] = continuation
+
+    def _extract_notification_renderer(self, notification):
+        video_id = traverse_obj(
+            notification, ('navigationEndpoint', 'watchEndpoint', 'videoId'), expected_type=str)
+        url = f'https://www.youtube.com/watch?v={video_id}'
+        channel_id = None
+        if not video_id:
+            browse_ep = traverse_obj(
+                notification, ('navigationEndpoint', 'browseEndpoint'), expected_type=dict)
+            channel_id = self.ucid_or_none(traverse_obj(browse_ep, 'browseId', expected_type=str))
+            post_id = self._search_regex(
+                r'/post/(.+)', traverse_obj(browse_ep, 'canonicalBaseUrl', expected_type=str),
+                'post id', default=None)
+            if not channel_id or not post_id:
+                return
+            # The direct /post url redirects to this in the browser
+            url = f'https://www.youtube.com/channel/{channel_id}/community?lb={post_id}'
+
+        channel = traverse_obj(
+            notification, ('contextualMenu', 'menuRenderer', 'items', 1, 'menuServiceItemRenderer', 'text', 'runs', 1, 'text'),
+            expected_type=str)
+        notification_title = self._get_text(notification, 'shortMessage')
+        if notification_title:
+            notification_title = notification_title.replace('\xad', '')  # remove soft hyphens
+        # TODO: handle recommended videos
+        title = self._search_regex(
+            rf'{re.escape(channel or "")}[^:]+: (.+)', notification_title,
+            'video title', default=None)
+        timestamp = (self._parse_time_text(self._get_text(notification, 'sentTimeText'))
+                     if self._configuration_arg('approximate_date', ie_key=YoutubeTabIE)
+                     else None)
+        return {
+            '_type': 'url',
+            'url': url,
+            'ie_key': (YoutubeIE if video_id else YoutubeTabIE).ie_key(),
+            'video_id': video_id,
+            'title': title,
+            'channel_id': channel_id,
+            'channel': channel,
+            'uploader': channel,
+            'thumbnails': self._extract_thumbnails(notification, 'videoThumbnail'),
+            'timestamp': timestamp,
+        }
+
+    def _notification_menu_entries(self, ytcfg):
+        continuation_list = [None]
+        response = None
+        for page in itertools.count(1):
+            ctoken = traverse_obj(
+                continuation_list, (0, 'continuationEndpoint', 'getNotificationMenuEndpoint', 'ctoken'), expected_type=str)
+            response = self._extract_response(
+                item_id=f'page {page}', query={'ctoken': ctoken} if ctoken else {}, ytcfg=ytcfg,
+                ep='notification/get_notification_menu', check_get_keys='actions',
+                headers=self.generate_api_headers(ytcfg=ytcfg, visitor_data=self._extract_visitor_data(response)))
+            yield from self._extract_notification_menu(response, continuation_list)
+            if not continuation_list[0]:
+                break
+
+    def _real_extract(self, url):
+        display_id = 'notifications'
+        ytcfg = self._download_ytcfg('web', display_id) if not self.skip_webpage else {}
+        self._report_playlist_authcheck(ytcfg)
+        return self.playlist_result(self._notification_menu_entries(ytcfg), display_id, display_id)
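The paging above uses a one-element list as a shared cell: the generator that parses a page stores the next continuation in `continuation_list[0]`, and the outer loop stops when none is left. A self-contained toy of the same pattern, with an invented stand-in for `_extract_response`:

    import itertools

    def fetch_page(token):
        # Invented fetcher: two pages, chained by a continuation token
        pages = {None: (['a', 'b'], 'p2'), 'p2': (['c'], None)}
        items, next_token = pages[token]
        return {'items': items, 'continuation': next_token}

    def parse_page(response, continuation_list):
        continuation_list[0] = response['continuation']
        yield from response['items']

    def entries():
        continuation_list = [None]
        token = None
        for _ in itertools.count(1):
            yield from parse_page(fetch_page(token), continuation_list)
            token = continuation_list[0]
            if not token:
                break

    print(list(entries()))  # ['a', 'b', 'c']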
247 yt_dlp/extractor/youtube/_redirect.py (new file)
@@ -0,0 +1,247 @@
+import base64
+import urllib.parse
+
+from ._base import YoutubeBaseInfoExtractor
+from ._tab import YoutubeTabIE
+from ...utils import ExtractorError, classproperty, parse_qs, update_url_query, url_or_none
+
+
+class YoutubeYtBeIE(YoutubeBaseInfoExtractor):
+    IE_DESC = 'youtu.be'
+    _VALID_URL = rf'https?://youtu\.be/(?P<id>[0-9A-Za-z_-]{{11}})/*?.*?\blist=(?P<playlist_id>{YoutubeBaseInfoExtractor._PLAYLIST_ID_RE})'
+    _TESTS = [{
+        'url': 'https://youtu.be/yeWKywCrFtk?list=PL2qgrgXsNUG5ig9cat4ohreBjYLAPC0J5',
+        'info_dict': {
+            'id': 'yeWKywCrFtk',
+            'ext': 'mp4',
+            'title': 'Small Scale Baler and Braiding Rugs',
+            'uploader': 'Backus-Page House Museum',
+            'uploader_id': '@backuspagemuseum',
+            'uploader_url': r're:https?://(?:www\.)?youtube\.com/@backuspagemuseum',
+            'upload_date': '20161008',
+            'description': 'md5:800c0c78d5eb128500bffd4f0b4f2e8a',
+            'categories': ['Nonprofits & Activism'],
+            'tags': list,
+            'like_count': int,
+            'age_limit': 0,
+            'playable_in_embed': True,
+            'thumbnail': r're:^https?://.*\.webp',
+            'channel': 'Backus-Page House Museum',
+            'channel_id': 'UCEfMCQ9bs3tjvjy1s451zaw',
+            'live_status': 'not_live',
+            'view_count': int,
+            'channel_url': 'https://www.youtube.com/channel/UCEfMCQ9bs3tjvjy1s451zaw',
+            'availability': 'public',
+            'duration': 59,
+            'comment_count': int,
+            'channel_follower_count': int,
+        },
+        'params': {
+            'noplaylist': True,
+            'skip_download': True,
+        },
+    }, {
+        'url': 'https://youtu.be/uWyaPkt-VOI?list=PL9D9FC436B881BA21',
+        'only_matching': True,
+    }]
+
+    def _real_extract(self, url):
+        mobj = self._match_valid_url(url)
+        video_id = mobj.group('id')
+        playlist_id = mobj.group('playlist_id')
+        return self.url_result(
+            update_url_query('https://www.youtube.com/watch', {
+                'v': video_id,
+                'list': playlist_id,
+                'feature': 'youtu.be',
+            }), ie=YoutubeTabIE.ie_key(), video_id=playlist_id)
+
+
+class YoutubeLivestreamEmbedIE(YoutubeBaseInfoExtractor):
+    IE_DESC = 'YouTube livestream embeds'
+    _VALID_URL = r'https?://(?:\w+\.)?youtube\.com/embed/live_stream/?\?(?:[^#]+&)?channel=(?P<id>[^&#]+)'
+    _TESTS = [{
+        'url': 'https://www.youtube.com/embed/live_stream?channel=UC2_KI6RB__jGdlnK6dvFEZA',
+        'only_matching': True,
+    }]
+
+    def _real_extract(self, url):
+        channel_id = self._match_id(url)
+        return self.url_result(
+            f'https://www.youtube.com/channel/{channel_id}/live',
+            ie=YoutubeTabIE.ie_key(), video_id=channel_id)
+
+
+class YoutubeYtUserIE(YoutubeBaseInfoExtractor):
+    IE_DESC = 'YouTube user videos; "ytuser:" prefix'
+    IE_NAME = 'youtube:user'
+    _VALID_URL = r'ytuser:(?P<id>.+)'
+    _TESTS = [{
+        'url': 'ytuser:phihag',
+        'only_matching': True,
+    }]
+
+    def _real_extract(self, url):
+        user_id = self._match_id(url)
+        return self.url_result(f'https://www.youtube.com/user/{user_id}', YoutubeTabIE, user_id)
+
+
+class YoutubeFavouritesIE(YoutubeBaseInfoExtractor):
+    IE_NAME = 'youtube:favorites'
+    IE_DESC = 'YouTube liked videos; ":ytfav" keyword (requires cookies)'
+    _VALID_URL = r':ytfav(?:ou?rite)?s?'
+    _LOGIN_REQUIRED = True
+    _TESTS = [{
+        'url': ':ytfav',
+        'only_matching': True,
+    }, {
+        'url': ':ytfavorites',
+        'only_matching': True,
+    }]
+
+    def _real_extract(self, url):
+        return self.url_result(
+            'https://www.youtube.com/playlist?list=LL',
+            ie=YoutubeTabIE.ie_key())
+
+
+class YoutubeFeedsInfoExtractor(YoutubeBaseInfoExtractor):
+    """
+    Base class for feed extractors
+
+    Subclasses must re-define the _FEED_NAME property.
+    """
+    _LOGIN_REQUIRED = True
+    _FEED_NAME = 'feeds'
+
+    @classproperty
+    def IE_NAME(cls):
+        return f'youtube:{cls._FEED_NAME}'
+
+    def _real_extract(self, url):
+        return self.url_result(
+            f'https://www.youtube.com/feed/{self._FEED_NAME}', ie=YoutubeTabIE.ie_key())
+
+
+class YoutubeWatchLaterIE(YoutubeBaseInfoExtractor):
+    IE_NAME = 'youtube:watchlater'
+    IE_DESC = 'Youtube watch later list; ":ytwatchlater" keyword (requires cookies)'
+    _VALID_URL = r':ytwatchlater'
+    _TESTS = [{
+        'url': ':ytwatchlater',
+        'only_matching': True,
+    }]
+
+    def _real_extract(self, url):
+        return self.url_result(
+            'https://www.youtube.com/playlist?list=WL', ie=YoutubeTabIE.ie_key())
+
+
+class YoutubeRecommendedIE(YoutubeFeedsInfoExtractor):
+    IE_DESC = 'YouTube recommended videos; ":ytrec" keyword'
+    _VALID_URL = r'https?://(?:www\.)?youtube\.com/?(?:[?#]|$)|:ytrec(?:ommended)?'
+    _FEED_NAME = 'recommended'
+    _LOGIN_REQUIRED = False
+    _TESTS = [{
+        'url': ':ytrec',
+        'only_matching': True,
+    }, {
+        'url': ':ytrecommended',
+        'only_matching': True,
+    }, {
+        'url': 'https://youtube.com',
+        'only_matching': True,
+    }]
+
+
+class YoutubeSubscriptionsIE(YoutubeFeedsInfoExtractor):
+    IE_DESC = 'YouTube subscriptions feed; ":ytsubs" keyword (requires cookies)'
+    _VALID_URL = r':ytsub(?:scription)?s?'
+    _FEED_NAME = 'subscriptions'
+    _TESTS = [{
+        'url': ':ytsubs',
+        'only_matching': True,
+    }, {
+        'url': ':ytsubscriptions',
+        'only_matching': True,
+    }]
+
+
+class YoutubeHistoryIE(YoutubeFeedsInfoExtractor):
+    IE_DESC = 'Youtube watch history; ":ythis" keyword (requires cookies)'
+    _VALID_URL = r':ythis(?:tory)?'
+    _FEED_NAME = 'history'
+    _TESTS = [{
+        'url': ':ythistory',
+        'only_matching': True,
+    }]
+
+
+class YoutubeShortsAudioPivotIE(YoutubeBaseInfoExtractor):
+    IE_DESC = 'YouTube Shorts audio pivot (Shorts using audio of a given video)'
+    IE_NAME = 'youtube:shorts:pivot:audio'
+    _VALID_URL = r'https?://(?:www\.)?youtube\.com/source/(?P<id>[\w-]{11})/shorts'
+    _TESTS = [{
+        'url': 'https://www.youtube.com/source/Lyj-MZSAA9o/shorts',
+        'only_matching': True,
+    }]
+
+    @staticmethod
+    def _generate_audio_pivot_params(video_id):
+        """
+        Generates sfv_audio_pivot browse params for this video id
+        """
+        pb_params = b'\xf2\x05+\n)\x12\'\n\x0b%b\x12\x0b%b\x1a\x0b%b' % ((video_id.encode(),) * 3)
+        return urllib.parse.quote(base64.b64encode(pb_params).decode())
+
+    def _real_extract(self, url):
+        video_id = self._match_id(url)
+        return self.url_result(
+            f'https://www.youtube.com/feed/sfv_audio_pivot?bp={self._generate_audio_pivot_params(video_id)}',
+            ie=YoutubeTabIE)
+
+
+class YoutubeConsentRedirectIE(YoutubeBaseInfoExtractor):
+    IE_NAME = 'youtube:consent'
+    IE_DESC = False  # Do not list
+    _VALID_URL = r'https?://consent\.youtube\.com/m\?'
+    _TESTS = [{
+        'url': 'https://consent.youtube.com/m?continue=https%3A%2F%2Fwww.youtube.com%2Flive%2FqVv6vCqciTM%3Fcbrd%3D1&gl=NL&m=0&pc=yt&hl=en&src=1',
+        'info_dict': {
+            'id': 'qVv6vCqciTM',
+            'ext': 'mp4',
+            'age_limit': 0,
+            'uploader_id': '@sana_natori',
+            'comment_count': int,
+            'chapters': 'count:13',
+            'upload_date': '20221223',
+            'thumbnail': 'https://i.ytimg.com/vi/qVv6vCqciTM/maxresdefault.jpg',
+            'channel_url': 'https://www.youtube.com/channel/UCIdEIHpS0TdkqRkHL5OkLtA',
+            'uploader_url': 'https://www.youtube.com/@sana_natori',
+            'like_count': int,
+            'release_date': '20221223',
+            'tags': ['Vtuber', '月ノ美兎', '名取さな', 'にじさんじ', 'クリスマス', '3D配信'],
+            'title': '【 #インターネット女クリスマス 】3Dで歌ってはしゃぐインターネットの女たち【月ノ美兎/名取さな】',
+            'view_count': int,
+            'playable_in_embed': True,
+            'duration': 4438,
+            'availability': 'public',
+            'channel_follower_count': int,
+            'channel_id': 'UCIdEIHpS0TdkqRkHL5OkLtA',
+            'categories': ['Entertainment'],
+            'live_status': 'was_live',
+            'release_timestamp': 1671793345,
+            'channel': 'さなちゃんねる',
+            'description': 'md5:6aebf95cc4a1d731aebc01ad6cc9806d',
+            'uploader': 'さなちゃんねる',
+            'channel_is_verified': True,
+            'heatmap': 'count:100',
+        },
+        'add_ie': ['Youtube'],
+        'params': {'skip_download': 'Youtube'},
+    }]
+
+    def _real_extract(self, url):
+        redirect_url = url_or_none(parse_qs(url).get('continue', [None])[-1])
+        if not redirect_url:
+            raise ExtractorError('Invalid cookie consent redirect URL', expected=True)
+        return self.url_result(redirect_url)
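`_generate_audio_pivot_params` just splices the video id into a fixed protobuf byte template three times, base64-encodes it, and URL-quotes the result, so it can be exercised standalone:

    import base64
    import urllib.parse

    video_id = 'Lyj-MZSAA9o'
    pb_params = b'\xf2\x05+\n)\x12\'\n\x0b%b\x12\x0b%b\x1a\x0b%b' % ((video_id.encode(),) * 3)
    bp = urllib.parse.quote(base64.b64encode(pb_params).decode())
    print(f'https://www.youtube.com/feed/sfv_audio_pivot?bp={bp}')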
167 yt_dlp/extractor/youtube/_search.py (new file)
@@ -0,0 +1,167 @@
+import urllib.parse
+
+from ._tab import YoutubeTabBaseInfoExtractor
+from ..common import SearchInfoExtractor
+from ...utils import join_nonempty, parse_qs
+
+
+class YoutubeSearchIE(YoutubeTabBaseInfoExtractor, SearchInfoExtractor):
+    IE_DESC = 'YouTube search'
+    IE_NAME = 'youtube:search'
+    _SEARCH_KEY = 'ytsearch'
+    _SEARCH_PARAMS = 'EgIQAfABAQ=='  # Videos only
+    _TESTS = [{
+        'url': 'ytsearch5:youtube-dl test video',
+        'playlist_count': 5,
+        'info_dict': {
+            'id': 'youtube-dl test video',
+            'title': 'youtube-dl test video',
+        },
+    }, {
+        'note': 'Suicide/self-harm search warning',
+        'url': 'ytsearch1:i hate myself and i wanna die',
+        'playlist_count': 1,
+        'info_dict': {
+            'id': 'i hate myself and i wanna die',
+            'title': 'i hate myself and i wanna die',
+        },
+    }]
+
+
+class YoutubeSearchDateIE(YoutubeTabBaseInfoExtractor, SearchInfoExtractor):
+    IE_NAME = YoutubeSearchIE.IE_NAME + ':date'
+    _SEARCH_KEY = 'ytsearchdate'
+    IE_DESC = 'YouTube search, newest videos first'
+    _SEARCH_PARAMS = 'CAISAhAB8AEB'  # Videos only, sorted by date
+    _TESTS = [{
+        'url': 'ytsearchdate5:youtube-dl test video',
+        'playlist_count': 5,
+        'info_dict': {
+            'id': 'youtube-dl test video',
+            'title': 'youtube-dl test video',
+        },
+    }]
+
+
+class YoutubeSearchURLIE(YoutubeTabBaseInfoExtractor):
+    IE_DESC = 'YouTube search URLs with sorting and filter support'
+    IE_NAME = YoutubeSearchIE.IE_NAME + '_url'
+    _VALID_URL = r'https?://(?:www\.)?youtube\.com/(?:results|search)\?([^#]+&)?(?:search_query|q)=(?:[^&]+)(?:[&#]|$)'
+    _TESTS = [{
+        'url': 'https://www.youtube.com/results?baz=bar&search_query=youtube-dl+test+video&filters=video&lclk=video',
+        'playlist_mincount': 5,
+        'info_dict': {
+            'id': 'youtube-dl test video',
+            'title': 'youtube-dl test video',
+        },
+    }, {
+        'url': 'https://www.youtube.com/results?search_query=python&sp=EgIQAg%253D%253D',
+        'playlist_mincount': 5,
+        'info_dict': {
+            'id': 'python',
+            'title': 'python',
+        },
+    }, {
+        'url': 'https://www.youtube.com/results?search_query=%23cats',
+        'playlist_mincount': 1,
+        'info_dict': {
+            'id': '#cats',
+            'title': '#cats',
+            # The test suite does not have support for nested playlists
+            # 'entries': [{
+            #     'url': r're:https://(www\.)?youtube\.com/hashtag/cats',
+            #     'title': '#cats',
+            # }],
+        },
+    }, {
+        # Channel results
+        'url': 'https://www.youtube.com/results?search_query=kurzgesagt&sp=EgIQAg%253D%253D',
+        'info_dict': {
+            'id': 'kurzgesagt',
+            'title': 'kurzgesagt',
+        },
+        'playlist': [{
+            'info_dict': {
+                '_type': 'url',
+                'id': 'UCsXVk37bltHxD1rDPwtNM8Q',
+                'url': 'https://www.youtube.com/channel/UCsXVk37bltHxD1rDPwtNM8Q',
+                'ie_key': 'YoutubeTab',
+                'channel': 'Kurzgesagt – In a Nutshell',
+                'description': 'md5:4ae48dfa9505ffc307dad26342d06bfc',
+                'title': 'Kurzgesagt – In a Nutshell',
+                'channel_id': 'UCsXVk37bltHxD1rDPwtNM8Q',
+                # No longer available for search as it is set to the handle.
+                # 'playlist_count': int,
+                'channel_url': 'https://www.youtube.com/channel/UCsXVk37bltHxD1rDPwtNM8Q',
+                'thumbnails': list,
+                'uploader_id': '@kurzgesagt',
+                'uploader_url': 'https://www.youtube.com/@kurzgesagt',
+                'uploader': 'Kurzgesagt – In a Nutshell',
+                'channel_is_verified': True,
+                'channel_follower_count': int,
+            },
+        }],
+        'params': {'extract_flat': True, 'playlist_items': '1'},
+        'playlist_mincount': 1,
+    }, {
+        'url': 'https://www.youtube.com/results?q=test&sp=EgQIBBgB',
+        'only_matching': True,
+    }]
+
+    def _real_extract(self, url):
+        qs = parse_qs(url)
+        query = (qs.get('search_query') or qs.get('q'))[0]
+        return self.playlist_result(self._search_results(query, qs.get('sp', (None,))[0]), query, query)
+
+
+class YoutubeMusicSearchURLIE(YoutubeTabBaseInfoExtractor):
+    IE_DESC = 'YouTube music search URLs with selectable sections, e.g. #songs'
+    IE_NAME = 'youtube:music:search_url'
+    _VALID_URL = r'https?://music\.youtube\.com/search\?([^#]+&)?(?:search_query|q)=(?:[^&]+)(?:[&#]|$)'
+    _TESTS = [{
+        'url': 'https://music.youtube.com/search?q=royalty+free+music',
+        'playlist_count': 16,
+        'info_dict': {
+            'id': 'royalty free music',
+            'title': 'royalty free music',
+        },
+    }, {
+        'url': 'https://music.youtube.com/search?q=royalty+free+music&sp=EgWKAQIIAWoKEAoQAxAEEAkQBQ%3D%3D',
+        'playlist_mincount': 30,
+        'info_dict': {
+            'id': 'royalty free music - songs',
+            'title': 'royalty free music - songs',
+        },
+        'params': {'extract_flat': 'in_playlist'},
+    }, {
+        'url': 'https://music.youtube.com/search?q=royalty+free+music#community+playlists',
+        'playlist_mincount': 30,
+        'info_dict': {
+            'id': 'royalty free music - community playlists',
+            'title': 'royalty free music - community playlists',
+        },
+        'params': {'extract_flat': 'in_playlist'},
+    }]
+
+    _SECTIONS = {
+        'albums': 'EgWKAQIYAWoKEAoQAxAEEAkQBQ==',
+        'artists': 'EgWKAQIgAWoKEAoQAxAEEAkQBQ==',
+        'community playlists': 'EgeKAQQoAEABagoQChADEAQQCRAF',
+        'featured playlists': 'EgeKAQQoADgBagwQAxAJEAQQDhAKEAU==',
+        'songs': 'EgWKAQIIAWoKEAoQAxAEEAkQBQ==',
+        'videos': 'EgWKAQIQAWoKEAoQAxAEEAkQBQ==',
+    }
+
+    def _real_extract(self, url):
+        qs = parse_qs(url)
+        query = (qs.get('search_query') or qs.get('q'))[0]
+        params = qs.get('sp', (None,))[0]
+        if params:
+            section = next((k for k, v in self._SECTIONS.items() if v == params), params)
+        else:
+            section = urllib.parse.unquote_plus(([*url.split('#'), ''])[1]).lower()
+            params = self._SECTIONS.get(section)
+            if not params:
+                section = None
+        title = join_nonempty(query, section, delim=' - ')
+        return self.playlist_result(self._search_results(query, params, default_client='web_music'), title, title)
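The section resolution in `YoutubeMusicSearchURLIE._real_extract` falls back from the `sp` query parameter to the URL fragment; the fragment path on its own:

    import urllib.parse

    url = 'https://music.youtube.com/search?q=royalty+free+music#community+playlists'
    section = urllib.parse.unquote_plus(([*url.split('#'), ''])[1]).lower()
    print(section)  # 'community playlists'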
2348 yt_dlp/extractor/youtube/_tab.py (new file; diff suppressed because it is too large)
(one further file's diff suppressed because it is too large)
@@ -137,6 +137,116 @@ def _extract_player(self, webpage, video_id, fatal=True):
                 group='json'),
             video_id)
 
+    def _extract_entry(self, url, player, content, video_id):
+        title = content.get('title') or content['teaserHeadline']
+
+        t = content['mainVideoContent']['http://zdf.de/rels/target']
+        ptmd_path = traverse_obj(t, (
+            (('streams', 'default'), None),
+            ('http://zdf.de/rels/streams/ptmd', 'http://zdf.de/rels/streams/ptmd-template'),
+        ), get_all=False)
+        if not ptmd_path:
+            raise ExtractorError('Could not extract ptmd_path')
+
+        info = self._extract_ptmd(
+            urljoin(url, ptmd_path.replace('{playerId}', 'android_native_5')), video_id, player['apiToken'], url)
+
+        thumbnails = []
+        layouts = try_get(
+            content, lambda x: x['teaserImageRef']['layouts'], dict)
+        if layouts:
+            for layout_key, layout_url in layouts.items():
+                layout_url = url_or_none(layout_url)
+                if not layout_url:
+                    continue
+                thumbnail = {
+                    'url': layout_url,
+                    'format_id': layout_key,
+                }
+                mobj = re.search(r'(?P<width>\d+)x(?P<height>\d+)', layout_key)
+                if mobj:
+                    thumbnail.update({
+                        'width': int(mobj.group('width')),
+                        'height': int(mobj.group('height')),
+                    })
+                thumbnails.append(thumbnail)
+
+        chapter_marks = t.get('streamAnchorTag') or []
+        chapter_marks.append({'anchorOffset': int_or_none(t.get('duration'))})
+        chapters = [{
+            'start_time': chap.get('anchorOffset'),
+            'end_time': next_chap.get('anchorOffset'),
+            'title': chap.get('anchorLabel'),
+        } for chap, next_chap in zip(chapter_marks, chapter_marks[1:])]
+
+        return merge_dicts(info, {
+            'title': title,
+            'description': content.get('leadParagraph') or content.get('teasertext'),
+            'duration': int_or_none(t.get('duration')),
+            'timestamp': unified_timestamp(content.get('editorialDate')),
+            'thumbnails': thumbnails,
+            'chapters': chapters or None,
+            'episode': title,
+            **traverse_obj(content, ('programmeItem', 0, 'http://zdf.de/rels/target', {
+                'series_id': ('http://zdf.de/rels/cmdm/series', 'seriesUuid', {str}),
+                'series': ('http://zdf.de/rels/cmdm/series', 'seriesTitle', {str}),
+                'season': ('http://zdf.de/rels/cmdm/season', 'seasonTitle', {str}),
+                'season_number': ('http://zdf.de/rels/cmdm/season', 'seasonNumber', {int_or_none}),
+                'season_id': ('http://zdf.de/rels/cmdm/season', 'seasonUuid', {str}),
+                'episode_number': ('episodeNumber', {int_or_none}),
+                'episode_id': ('contentId', {str}),
+            })),
+        })
+
+    def _extract_regular(self, url, player, video_id, query=None):
+        player_url = player['content']
+
+        content = self._call_api(
+            update_url_query(player_url, query),
+            video_id, 'content', player['apiToken'], url)
+
+        return self._extract_entry(player_url, player, content, video_id)
+
+    def _extract_mobile(self, video_id):
+        video = self._download_v2_doc(video_id)
+
+        formats = []
+        formitaeten = try_get(video, lambda x: x['document']['formitaeten'], list)
+        document = formitaeten and video['document']
+        if formitaeten:
+            title = document['titel']
+            content_id = document['basename']
+
+            format_urls = set()
+            for f in formitaeten or []:
+                self._extract_format(content_id, formats, format_urls, f)
+
+        thumbnails = []
+        teaser_bild = document.get('teaserBild')
+        if isinstance(teaser_bild, dict):
+            for thumbnail_key, thumbnail in teaser_bild.items():
+                thumbnail_url = try_get(
+                    thumbnail, lambda x: x['url'], str)
+                if thumbnail_url:
+                    thumbnails.append({
+                        'url': thumbnail_url,
+                        'id': thumbnail_key,
+                        'width': int_or_none(thumbnail.get('width')),
+                        'height': int_or_none(thumbnail.get('height')),
+                    })
+
+        return {
+            'id': content_id,
+            'title': title,
+            'description': document.get('beschreibung'),
+            'duration': int_or_none(document.get('length')),
+            'timestamp': unified_timestamp(document.get('date')) or unified_timestamp(
+                try_get(video, lambda x: x['meta']['editorialDate'], str)),
+            'thumbnails': thumbnails,
+            'subtitles': self._extract_subtitles(document),
+            'formats': formats,
+        }
+
+
 class ZDFIE(ZDFBaseIE):
     _VALID_URL = r'https?://www\.zdf\.de/(?:[^/]+/)*(?P<id>[^/?#&]+)\.html'
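The chapter construction in `_extract_entry` pairs each anchor with its successor after appending the total duration, so the final chapter also gets an end time. With invented marks:

    marks = [
        {'anchorOffset': 0, 'anchorLabel': 'Begrüßung'},
        {'anchorOffset': 1200, 'anchorLabel': 'Hauptteil'},
        {'anchorOffset': 2615},  # appended duration closes the last chapter
    ]
    chapters = [{
        'start_time': chap.get('anchorOffset'),
        'end_time': next_chap.get('anchorOffset'),
        'title': chap.get('anchorLabel'),
    } for chap, next_chap in zip(marks, marks[1:])]
    print(chapters)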
@@ -187,12 +297,20 @@ class ZDFIE(ZDFBaseIE):
         'info_dict': {
             'id': '151025_magie_farben2_tex',
             'ext': 'mp4',
+            'duration': 2615.0,
             'title': 'Die Magie der Farben (2/2)',
             'description': 'md5:a89da10c928c6235401066b60a6d5c1a',
-            'duration': 2615,
             'timestamp': 1465021200,
-            'upload_date': '20160604',
             'thumbnail': 'https://www.zdf.de/assets/mauve-im-labor-100~768x432?cb=1464909117806',
+            'upload_date': '20160604',
+            'episode': 'Die Magie der Farben (2/2)',
+            'episode_id': 'POS_954f4170-36a5-4a41-a6cf-78f1f3b1f127',
+            'season': 'Staffel 1',
+            'series': 'Die Magie der Farben',
+            'season_number': 1,
+            'series_id': 'a39900dd-cdbd-4a6a-a413-44e8c6ae18bc',
+            'season_id': '5a92e619-8a0f-4410-a3d5-19c76fbebb37',
+            'episode_number': 2,
         },
     }, {
         'url': 'https://www.zdf.de/funk/druck-11790/funk-alles-ist-verzaubert-102.html',
@@ -200,12 +318,13 @@ class ZDFIE(ZDFBaseIE):
         'info_dict': {
             'ext': 'mp4',
             'id': 'video_funk_1770473',
-            'duration': 1278,
-            'description': 'Die Neue an der Schule verdreht Ismail den Kopf.',
+            'duration': 1278.0,
             'title': 'Alles ist verzaubert',
+            'description': 'Die Neue an der Schule verdreht Ismail den Kopf.',
             'timestamp': 1635520560,
-            'upload_date': '20211029',
             'thumbnail': 'https://www.zdf.de/assets/teaser-funk-alles-ist-verzaubert-102~1920x1080?cb=1663848412907',
+            'upload_date': '20211029',
+            'episode': 'Alles ist verzaubert',
         },
     }, {
         # Same as https://www.phoenix.de/sendungen/dokumentationen/gesten-der-maechtigen-i-a-89468.html?ref=suche
@@ -248,121 +367,55 @@ class ZDFIE(ZDFBaseIE):
             'title': 'Das Geld anderer Leute',
             'description': 'md5:cb6f660850dc5eb7d1ab776ea094959d',
             'duration': 2581.0,
-            'timestamp': 1675160100,
-            'upload_date': '20230131',
+            'timestamp': 1728983700,
+            'upload_date': '20241015',
             'thumbnail': 'https://epg-image.zdf.de/fotobase-webdelivery/images/e2d7e55a-09f0-424e-ac73-6cac4dd65f35?layout=2400x1350',
+            'series': 'SOKO Stuttgart',
+            'series_id': 'f862ce9a-6dd1-4388-a698-22b36ac4c9e9',
+            'season': 'Staffel 11',
+            'season_number': 11,
+            'season_id': 'ae1b4990-6d87-4970-a571-caccf1ba2879',
+            'episode': 'Das Geld anderer Leute',
+            'episode_number': 10,
+            'episode_id': 'POS_7f367934-f2f0-45cb-9081-736781ff2d23',
         },
     }, {
         'url': 'https://www.zdf.de/dokumentation/terra-x/unser-gruener-planet-wuesten-doku-100.html',
         'info_dict': {
-            'id': '220605_dk_gruener_planet_wuesten_tex',
+            'id': '220525_green_planet_makingof_1_tropen_tex',
             'ext': 'mp4',
-            'title': 'Unser grüner Planet - Wüsten',
-            'description': 'md5:4fc647b6f9c3796eea66f4a0baea2862',
-            'duration': 2613.0,
-            'timestamp': 1654450200,
-            'upload_date': '20220605',
-            'format_note': 'uhd, main',
-            'thumbnail': 'https://www.zdf.de/assets/saguaro-kakteen-102~3840x2160?cb=1655910690796',
+            'title': 'Making-of Unser grüner Planet - Tropen',
+            'description': 'md5:d7c6949dc7c75c73c4ad51c785fb0b79',
+            'duration': 435.0,
+            'timestamp': 1653811200,
+            'upload_date': '20220529',
+            'format_note': 'hd, main',
+            'thumbnail': 'https://www.zdf.de/assets/unser-gruener-planet-making-of-1-tropen-100~3840x2160?cb=1653493335577',
+            'episode': 'Making-of Unser grüner Planet - Tropen',
+        },
+        'skip': 'No longer available: "Leider kein Video verfügbar"',
+    }, {
+        'url': 'https://www.zdf.de/serien/northern-lights/begegnung-auf-der-bruecke-100.html',
+        'info_dict': {
+            'id': '240319_2310_sendung_not',
+            'ext': 'mp4',
+            'title': 'Begegnung auf der Brücke',
+            'description': 'md5:e53a555da87447f7f1207f10353f8e45',
+            'thumbnail': 'https://epg-image.zdf.de/fotobase-webdelivery/images/c5ff1d1f-f5c8-4468-86ac-1b2f1dbecc76?layout=2400x1350',
+            'upload_date': '20250203',
+            'duration': 3083.0,
+            'timestamp': 1738546500,
+            'series_id': '1d7a1879-01ee-4468-8237-c6b4ecd633c7',
+            'series': 'Northern Lights',
+            'season': 'Staffel 1',
+            'season_number': 1,
+            'season_id': '22ac26a2-4ea2-4055-ac0b-98b755cdf718',
+            'episode': 'Begegnung auf der Brücke',
+            'episode_number': 1,
+            'episode_id': 'POS_71049438-024b-471f-b472-4fe2e490d1fb',
         },
     }]
 
-    def _extract_entry(self, url, player, content, video_id):
-        title = content.get('title') or content['teaserHeadline']
-
-        t = content['mainVideoContent']['http://zdf.de/rels/target']
-        ptmd_path = traverse_obj(t, (
-            (('streams', 'default'), None),
-            ('http://zdf.de/rels/streams/ptmd', 'http://zdf.de/rels/streams/ptmd-template'),
-        ), get_all=False)
-        if not ptmd_path:
-            raise ExtractorError('Could not extract ptmd_path')
-
-        info = self._extract_ptmd(
-            urljoin(url, ptmd_path.replace('{playerId}', 'android_native_5')), video_id, player['apiToken'], url)
-
-        thumbnails = []
-        layouts = try_get(
-            content, lambda x: x['teaserImageRef']['layouts'], dict)
-        if layouts:
-            for layout_key, layout_url in layouts.items():
-                layout_url = url_or_none(layout_url)
-                if not layout_url:
-                    continue
-                thumbnail = {
-                    'url': layout_url,
-                    'format_id': layout_key,
-                }
-                mobj = re.search(r'(?P<width>\d+)x(?P<height>\d+)', layout_key)
-                if mobj:
-                    thumbnail.update({
-                        'width': int(mobj.group('width')),
-                        'height': int(mobj.group('height')),
-                    })
-                thumbnails.append(thumbnail)
-
-        chapter_marks = t.get('streamAnchorTag') or []
-        chapter_marks.append({'anchorOffset': int_or_none(t.get('duration'))})
-        chapters = [{
-            'start_time': chap.get('anchorOffset'),
-            'end_time': next_chap.get('anchorOffset'),
|
|
||||||
'title': chap.get('anchorLabel'),
|
|
||||||
} for chap, next_chap in zip(chapter_marks, chapter_marks[1:])]
|
|
||||||
|
|
||||||
return merge_dicts(info, {
|
|
||||||
'title': title,
|
|
||||||
'description': content.get('leadParagraph') or content.get('teasertext'),
|
|
||||||
'duration': int_or_none(t.get('duration')),
|
|
||||||
'timestamp': unified_timestamp(content.get('editorialDate')),
|
|
||||||
'thumbnails': thumbnails,
|
|
||||||
'chapters': chapters or None,
|
|
||||||
})
|
|
||||||
|
|
||||||
def _extract_regular(self, url, player, video_id):
|
|
||||||
content = self._call_api(
|
|
||||||
player['content'], video_id, 'content', player['apiToken'], url)
|
|
||||||
return self._extract_entry(player['content'], player, content, video_id)
|
|
||||||
|
|
||||||
def _extract_mobile(self, video_id):
|
|
||||||
video = self._download_v2_doc(video_id)
|
|
||||||
|
|
||||||
formats = []
|
|
||||||
formitaeten = try_get(video, lambda x: x['document']['formitaeten'], list)
|
|
||||||
document = formitaeten and video['document']
|
|
||||||
if formitaeten:
|
|
||||||
title = document['titel']
|
|
||||||
content_id = document['basename']
|
|
||||||
|
|
||||||
format_urls = set()
|
|
||||||
for f in formitaeten or []:
|
|
||||||
self._extract_format(content_id, formats, format_urls, f)
|
|
||||||
|
|
||||||
thumbnails = []
|
|
||||||
teaser_bild = document.get('teaserBild')
|
|
||||||
if isinstance(teaser_bild, dict):
|
|
||||||
for thumbnail_key, thumbnail in teaser_bild.items():
|
|
||||||
thumbnail_url = try_get(
|
|
||||||
thumbnail, lambda x: x['url'], str)
|
|
||||||
if thumbnail_url:
|
|
||||||
thumbnails.append({
|
|
||||||
'url': thumbnail_url,
|
|
||||||
'id': thumbnail_key,
|
|
||||||
'width': int_or_none(thumbnail.get('width')),
|
|
||||||
'height': int_or_none(thumbnail.get('height')),
|
|
||||||
})
|
|
||||||
|
|
||||||
return {
|
|
||||||
'id': content_id,
|
|
||||||
'title': title,
|
|
||||||
'description': document.get('beschreibung'),
|
|
||||||
'duration': int_or_none(document.get('length')),
|
|
||||||
'timestamp': unified_timestamp(document.get('date')) or unified_timestamp(
|
|
||||||
try_get(video, lambda x: x['meta']['editorialDate'], str)),
|
|
||||||
'thumbnails': thumbnails,
|
|
||||||
'subtitles': self._extract_subtitles(document),
|
|
||||||
'formats': formats,
|
|
||||||
}
|
|
||||||
|
|
||||||
def _real_extract(self, url):
|
def _real_extract(self, url):
|
||||||
video_id = self._match_id(url)
|
video_id = self._match_id(url)
|
||||||
|
|
||||||
@ -370,7 +423,7 @@ def _real_extract(self, url):
         if webpage:
             player = self._extract_player(webpage, url, fatal=False)
             if player:
-                return self._extract_regular(url, player, video_id)
+                return self._extract_regular(url, player, video_id, query={'profile': 'player-3'})

         return self._extract_mobile(video_id)
@ -416,7 +469,8 @@ def _extract_entry(self, entry):
             'title': ('titel', {str}),
             'description': ('beschreibung', {str}),
             'duration': ('length', {float_or_none}),
-            # TODO: seasonNumber and episodeNumber can be extracted but need to also be in ZDFIE
+            'season_number': ('seasonNumber', {int_or_none}),
+            'episode_number': ('episodeNumber', {int_or_none}),
         }))

    def _entries(self, data, document_id):
30
yt_dlp/globals.py
Normal file
@ -0,0 +1,30 @@
+from collections import defaultdict
+
+# Please Note: Due to necessary changes and the complex nature involved in the plugin/globals system,
+# no backwards compatibility is guaranteed for the plugin system API.
+# However, we will still try our best.
+
+
+class Indirect:
+    def __init__(self, initial, /):
+        self.value = initial
+
+    def __repr__(self, /):
+        return f'{type(self).__name__}({self.value!r})'
+
+
+postprocessors = Indirect({})
+extractors = Indirect({})
+
+# Plugins
+all_plugins_loaded = Indirect(False)
+plugin_specs = Indirect({})
+plugin_dirs = Indirect(['default'])
+
+plugin_ies = Indirect({})
+plugin_pps = Indirect({})
+plugin_ies_overrides = Indirect(defaultdict(list))
+
+# Misc
+IN_CLI = Indirect(False)
+LAZY_EXTRACTORS = Indirect(None)  # `False`=force, `None`=disabled, `True`=enabled
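A quick sketch of the pattern this new module enables: every importer shares the one `Indirect` container and communicates by rebinding `.value` (the appended path below is illustrative):

import yt_dlp.globals as g

# Rebinding `.value` mutates the shared container in place, so every module
# holding a reference to the same Indirect sees the update immediately
g.plugin_dirs.value = [*g.plugin_dirs.value, '/opt/ytdlp-plugins']  # illustrative path

# __repr__ makes the wrapped state visible when debugging
assert repr(g.IN_CLI) == 'Indirect(False)'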
@ -25,7 +25,7 @@ def zeroise(x):
         with contextlib.suppress(TypeError):
             if math.isnan(x):  # NB: NaN cannot be checked by membership
                 return 0
-        return x
+        return int(float(x))

     def wrapped(a, b):
         return op(zeroise(a), zeroise(b)) & 0xffffffff
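For context, a standalone copy of the patched helper showing what the new `int(float(x))` coercion buys (a sketch: the `JS_Undefined` branch is elided here, and `operator.or_` stands in for whichever bitwise op gets wrapped):

import contextlib
import math
import operator

def _js_bit_op(op):
    def zeroise(x):
        if x is None:  # JS_Undefined handling elided in this sketch
            return 0
        with contextlib.suppress(TypeError):
            if math.isnan(x):
                return 0
        return int(float(x))  # new: truncate floats and numeric strings

    def wrapped(a, b):
        return op(zeroise(a), zeroise(b)) & 0xffffffff

    return wrapped

js_or = _js_bit_op(operator.or_)
assert js_or(3.9, 0) == 3     # with the old `return x`, float | int would raise TypeError
assert js_or('5', None) == 5  # numeric strings now coerce, matching JS ToInt32 semantics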
@ -95,6 +95,61 @@ def _js_ternary(cndn, if_true=True, if_false=False):
     return if_true


+# Ref: https://es5.github.io/#x9.8.1
+def js_number_to_string(val: float, radix: int = 10):
+    if radix in (JS_Undefined, None):
+        radix = 10
+    assert radix in range(2, 37), 'radix must be an integer at least 2 and no greater than 36'
+
+    if math.isnan(val):
+        return 'NaN'
+    if val == 0:
+        return '0'
+    if math.isinf(val):
+        return '-Infinity' if val < 0 else 'Infinity'
+    if radix == 10:
+        # TODO: implement special cases
+        ...
+
+    ALPHABET = b'0123456789abcdefghijklmnopqrstuvwxyz.-'
+
+    result = collections.deque()
+    sign = val < 0
+    val = abs(val)
+    fraction, integer = math.modf(val)
+    delta = max(math.nextafter(.0, math.inf), math.ulp(val) / 2)
+
+    if fraction >= delta:
+        result.append(-2)  # `.`
+        while fraction >= delta:
+            delta *= radix
+            fraction, digit = math.modf(fraction * radix)
+            result.append(int(digit))
+            # if we need to round, propagate potential carry through fractional part
+            needs_rounding = fraction > 0.5 or (fraction == 0.5 and int(digit) & 1)
+            if needs_rounding and fraction + delta > 1:
+                for index in reversed(range(1, len(result))):
+                    if result[index] + 1 < radix:
+                        result[index] += 1
+                        break
+                    result.pop()
+                else:
+                    integer += 1
+                break
+
+    integer, digit = divmod(int(integer), radix)
+    result.appendleft(digit)
+    while integer > 0:
+        integer, digit = divmod(integer, radix)
+        result.appendleft(digit)
+
+    if sign:
+        result.appendleft(-1)  # `-`
+
+    return bytes(ALPHABET[digit] for digit in result).decode('ascii')
+
+
 # Ref: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Operators/Operator_Precedence
 _OPERATORS = {  # None => Defined in JSInterpreter._operator
     '?': None,
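A few hand-checked examples of what the algorithm above produces (assuming `js_number_to_string` is importable from `yt_dlp.jsinterp` as added here; note the radix-10 fast path is still a TODO and falls through to the general algorithm):

from yt_dlp.jsinterp import js_number_to_string

assert js_number_to_string(255, 16) == 'ff'   # integer digits via repeated divmod
assert js_number_to_string(-12, 8) == '-14'   # sign prepended as ALPHABET[-1], i.e. '-'
assert js_number_to_string(0.5, 2) == '0.1'   # fractional digits via repeated modf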
@ -296,6 +296,7 @@ def _check_extensions(self, extensions):
         extensions.pop('cookiejar', None)
         extensions.pop('timeout', None)
         extensions.pop('legacy_ssl', None)
+        extensions.pop('keep_header_casing', None)

     def _create_instance(self, cookiejar, legacy_ssl_support=None):
         session = RequestsSession()
@ -312,11 +313,12 @@ def _create_instance(self, cookiejar, legacy_ssl_support=None):
         session.trust_env = False  # no need, we already load proxies from env
         return session

-    def _send(self, request):
-        headers = self._merge_headers(request.headers)
+    def _prepare_headers(self, _, headers):
         add_accept_encoding_header(headers, SUPPORTED_ENCODINGS)

+    def _send(self, request):
+        headers = self._get_headers(request)
         max_redirects_exceeded = False

         session = self._get_instance(
@ -379,13 +379,15 @@ def _create_instance(self, proxies, cookiejar, legacy_ssl_support=None):
         opener.addheaders = []
         return opener

-    def _send(self, request):
-        headers = self._merge_headers(request.headers)
+    def _prepare_headers(self, _, headers):
         add_accept_encoding_header(headers, SUPPORTED_ENCODINGS)

+    def _send(self, request):
+        headers = self._get_headers(request)
         urllib_req = urllib.request.Request(
             url=request.url,
             data=request.data,
-            headers=dict(headers),
+            headers=headers,
             method=request.method,
         )
@ -116,6 +116,7 @@ def _check_extensions(self, extensions):
         extensions.pop('timeout', None)
         extensions.pop('cookiejar', None)
         extensions.pop('legacy_ssl', None)
+        extensions.pop('keep_header_casing', None)

     def close(self):
         # Remove the logging handler that contains a reference to our logger
@ -123,15 +124,16 @@ def close(self):
         for name, handler in self.__logging_handlers.items():
             logging.getLogger(name).removeHandler(handler)

-    def _send(self, request):
-        timeout = self._calculate_timeout(request)
-        headers = self._merge_headers(request.headers)
+    def _prepare_headers(self, request, headers):
         if 'cookie' not in headers:
             cookiejar = self._get_cookiejar(request)
             cookie_header = cookiejar.get_cookie_header(request.url)
             if cookie_header:
                 headers['cookie'] = cookie_header

+    def _send(self, request):
+        timeout = self._calculate_timeout(request)
+        headers = self._get_headers(request)
         wsuri = parse_uri(request.url)
         create_conn_kwargs = {
             'source_address': (self.source_address, 0) if self.source_address else None,
@ -206,6 +206,7 @@ class RequestHandler(abc.ABC):
     - `cookiejar`: Cookiejar to use for this request.
     - `timeout`: socket timeout to use for this request.
     - `legacy_ssl`: Enable legacy SSL options for this request. See legacy_ssl_support.
+    - `keep_header_casing`: Keep the casing of headers when sending the request.
     To enable these, add extensions.pop('<extension>', None) to _check_extensions

     Apart from the url protocol, proxies dict may contain the following keys:
|
|||||||
def _merge_headers(self, request_headers):
|
def _merge_headers(self, request_headers):
|
||||||
return HTTPHeaderDict(self.headers, request_headers)
|
return HTTPHeaderDict(self.headers, request_headers)
|
||||||
|
|
||||||
|
def _prepare_headers(self, request: Request, headers: HTTPHeaderDict) -> None: # noqa: B027
|
||||||
|
"""Additional operations to prepare headers before building. To be extended by subclasses.
|
||||||
|
@param request: Request object
|
||||||
|
@param headers: Merged headers to prepare
|
||||||
|
"""
|
||||||
|
|
||||||
|
def _get_headers(self, request: Request) -> dict[str, str]:
|
||||||
|
"""
|
||||||
|
Get headers for external use.
|
||||||
|
Subclasses may define a _prepare_headers method to modify headers after merge but before building.
|
||||||
|
"""
|
||||||
|
headers = self._merge_headers(request.headers)
|
||||||
|
self._prepare_headers(request, headers)
|
||||||
|
if request.extensions.get('keep_header_casing'):
|
||||||
|
return headers.sensitive()
|
||||||
|
return dict(headers)
|
||||||
|
|
||||||
def _calculate_timeout(self, request):
|
def _calculate_timeout(self, request):
|
||||||
return float(request.extensions.get('timeout') or self.timeout)
|
return float(request.extensions.get('timeout') or self.timeout)
|
||||||
|
|
||||||
@ -317,6 +335,7 @@ def _check_extensions(self, extensions):
|
|||||||
assert isinstance(extensions.get('cookiejar'), (YoutubeDLCookieJar, NoneType))
|
assert isinstance(extensions.get('cookiejar'), (YoutubeDLCookieJar, NoneType))
|
||||||
assert isinstance(extensions.get('timeout'), (float, int, NoneType))
|
assert isinstance(extensions.get('timeout'), (float, int, NoneType))
|
||||||
assert isinstance(extensions.get('legacy_ssl'), (bool, NoneType))
|
assert isinstance(extensions.get('legacy_ssl'), (bool, NoneType))
|
||||||
|
assert isinstance(extensions.get('keep_header_casing'), (bool, NoneType))
|
||||||
|
|
||||||
def _validate(self, request):
|
def _validate(self, request):
|
||||||
self._check_url_scheme(request)
|
self._check_url_scheme(request)
|
||||||
|
@ -5,11 +5,11 @@
 from dataclasses import dataclass
 from typing import Any

-from .common import RequestHandler, register_preference
+from .common import RequestHandler, register_preference, Request
 from .exceptions import UnsupportedRequest
 from ..compat.types import NoneType
 from ..utils import classproperty, join_nonempty
-from ..utils.networking import std_headers
+from ..utils.networking import std_headers, HTTPHeaderDict


 @dataclass(order=True, frozen=True)
@ -123,7 +123,17 @@ def _get_request_target(self, request):
         """Get the requested target for the request"""
         return self._resolve_target(request.extensions.get('impersonate') or self.impersonate)

-    def _get_impersonate_headers(self, request):
+    def _prepare_impersonate_headers(self, request: Request, headers: HTTPHeaderDict) -> None:  # noqa: B027
+        """Additional operations to prepare headers before building. To be extended by subclasses.
+        @param request: Request object
+        @param headers: Merged headers to prepare
+        """
+
+    def _get_impersonate_headers(self, request: Request) -> dict[str, str]:
+        """
+        Get headers for external impersonation use.
+        Subclasses may define a _prepare_impersonate_headers method to modify headers after merge but before building.
+        """
         headers = self._merge_headers(request.headers)
         if self._get_request_target(request) is not None:
             # remove all headers present in std_headers
@ -131,7 +141,11 @@ def _get_impersonate_headers(self, request):
             for k, v in std_headers.items():
                 if headers.get(k) == v:
                     headers.pop(k)
-        return headers
+
+        self._prepare_impersonate_headers(request, headers)
+        if request.extensions.get('keep_header_casing'):
+            return headers.sensitive()
+        return dict(headers)


 @register_preference(ImpersonateRequestHandler)
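From the caller's side, the extension is opted into per request; with it set, the handlers return headers via `.sensitive()` and preserve the exact casing given (the URL and header below are illustrative):

from yt_dlp.networking import Request

req = Request(
    'https://example.com',                    # illustrative URL
    headers={'X-Custom-ID': 'abc'},
    extensions={'keep_header_casing': True},  # keep 'X-Custom-ID' as written
)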
@ -398,7 +398,7 @@ def _alias_callback(option, opt_str, value, parser, opts, nargs):
                 '(Alias: --no-config)'))
     general.add_option(
         '--no-config-locations',
-        action='store_const', dest='config_locations', const=[],
+        action='store_const', dest='config_locations', const=None,
         help=(
             'Do not load any custom configuration files (default). When given inside a '
             'configuration file, ignore all previous --config-locations defined in the current file'))
@ -410,12 +410,21 @@ def _alias_callback(option, opt_str, value, parser, opts, nargs):
             '("-" for stdin). Can be used multiple times and inside other configuration files'))
     general.add_option(
         '--plugin-dirs',
-        dest='plugin_dirs', metavar='PATH', action='append',
+        metavar='PATH',
+        dest='plugin_dirs',
+        action='callback',
+        callback=_list_from_options_callback,
+        type='str',
+        callback_kwargs={'delim': None},
+        default=['default'],
         help=(
             'Path to an additional directory to search for plugins. '
             'This option can be used multiple times to add multiple directories. '
-            'Note that this currently only works for extractor plugins; '
-            'postprocessor plugins can only be loaded from the default plugin directories'))
+            'Use "default" to search the default plugin directories (default)'))
+    general.add_option(
+        '--no-plugin-dirs',
+        dest='plugin_dirs', action='store_const', const=[],
+        help='Clear plugin directories to search, including defaults and those provided by previous --plugin-dirs')
     general.add_option(
         '--flat-playlist',
         action='store_const', dest='extract_flat', const='in_playlist', default=False,
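The net effect on parsed options, sketched (assuming `parseOpts`' usual `(parser, opts, args)` return shape; the plugin path is illustrative):

from yt_dlp.options import parseOpts

# repeated --plugin-dirs appends to the ['default'] starting value
_, opts, _ = parseOpts(['--plugin-dirs', '/opt/ytdlp-plugins', 'URL'])
assert opts.plugin_dirs == ['default', '/opt/ytdlp-plugins']

# --no-plugin-dirs clears the list, defaults included
_, opts, _ = parseOpts(['--no-plugin-dirs', 'URL'])
assert opts.plugin_dirs == []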
@ -1,4 +1,5 @@
 import contextlib
+import dataclasses
 import functools
 import importlib
 import importlib.abc
@ -14,17 +15,48 @@
 from pathlib import Path
 from zipfile import ZipFile

+from .globals import (
+    Indirect,
+    plugin_dirs,
+    all_plugins_loaded,
+    plugin_specs,
+)
+
 from .utils import (
-    Config,
     get_executable_path,
     get_system_config_dirs,
     get_user_config_dirs,
+    merge_dicts,
     orderedSet,
     write_string,
 )

 PACKAGE_NAME = 'yt_dlp_plugins'
 COMPAT_PACKAGE_NAME = 'ytdlp_plugins'
+_BASE_PACKAGE_PATH = Path(__file__).parent
+
+# Please Note: Due to necessary changes and the complex nature involved,
+# no backwards compatibility is guaranteed for the plugin system API.
+# However, we will still try our best.
+
+__all__ = [
+    'COMPAT_PACKAGE_NAME',
+    'PACKAGE_NAME',
+    'PluginSpec',
+    'directories',
+    'load_all_plugins',
+    'load_plugins',
+    'register_plugin_spec',
+]
+
+
+@dataclasses.dataclass
+class PluginSpec:
+    module_name: str
+    suffix: str
+    destination: Indirect
+    plugin_destination: Indirect


 class PluginLoader(importlib.abc.Loader):
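A sketch of how a `PluginSpec` wires a plugin namespace to its `Indirect` destinations, mirroring how this commit registers the extractor spec elsewhere (the actual registration site is not shown in this diff):

from yt_dlp.globals import extractors, plugin_ies
from yt_dlp.plugins import PluginSpec, register_plugin_spec

register_plugin_spec(PluginSpec(
    module_name='extractor',        # loads yt_dlp_plugins.extractor.*
    suffix='IE',                    # plugin classes must end with this suffix
    destination=extractors,         # main lookup the plugin classes get merged into
    plugin_destination=plugin_ies,  # lookup holding only the plugin classes
))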
@ -44,7 +76,42 @@ def dirs_in_zip(archive):
             pass
     except Exception as e:
         write_string(f'WARNING: Could not read zip file {archive}: {e}\n')
-    return set()
+    return ()
+
+
+def default_plugin_paths():
+    def _get_package_paths(*root_paths, containing_folder):
+        for config_dir in orderedSet(map(Path, root_paths), lazy=True):
+            # We need to filter the base path added when running __main__.py directly
+            if config_dir == _BASE_PACKAGE_PATH:
+                continue
+            with contextlib.suppress(OSError):
+                yield from (config_dir / containing_folder).iterdir()
+
+    # Load from yt-dlp config folders
+    yield from _get_package_paths(
+        *get_user_config_dirs('yt-dlp'),
+        *get_system_config_dirs('yt-dlp'),
+        containing_folder='plugins',
+    )
+
+    # Load from yt-dlp-plugins folders
+    yield from _get_package_paths(
+        get_executable_path(),
+        *get_user_config_dirs(''),
+        *get_system_config_dirs(''),
+        containing_folder='yt-dlp-plugins',
+    )
+
+    # Load from PYTHONPATH directories
+    yield from (path for path in map(Path, sys.path) if path != _BASE_PACKAGE_PATH)
+
+
+def candidate_plugin_paths(candidate):
+    candidate_path = Path(candidate)
+    if not candidate_path.is_dir():
+        raise ValueError(f'Invalid plugin directory: {candidate_path}')
+    yield from candidate_path.iterdir()


 class PluginFinder(importlib.abc.MetaPathFinder):
@ -56,40 +123,16 @@ class PluginFinder(importlib.abc.MetaPathFinder):

     def __init__(self, *packages):
         self._zip_content_cache = {}
-        self.packages = set(itertools.chain.from_iterable(
-            itertools.accumulate(name.split('.'), lambda a, b: '.'.join((a, b)))
-            for name in packages))
+        self.packages = set(
+            itertools.chain.from_iterable(
+                itertools.accumulate(name.split('.'), lambda a, b: '.'.join((a, b)))
+                for name in packages))

     def search_locations(self, fullname):
-        candidate_locations = []
-
-        def _get_package_paths(*root_paths, containing_folder='plugins'):
-            for config_dir in orderedSet(map(Path, root_paths), lazy=True):
-                with contextlib.suppress(OSError):
-                    yield from (config_dir / containing_folder).iterdir()
-
-        # Load from yt-dlp config folders
-        candidate_locations.extend(_get_package_paths(
-            *get_user_config_dirs('yt-dlp'),
-            *get_system_config_dirs('yt-dlp'),
-            containing_folder='plugins'))
-
-        # Load from yt-dlp-plugins folders
-        candidate_locations.extend(_get_package_paths(
-            get_executable_path(),
-            *get_user_config_dirs(''),
-            *get_system_config_dirs(''),
-            containing_folder='yt-dlp-plugins'))
-
-        candidate_locations.extend(map(Path, sys.path))  # PYTHONPATH
-        with contextlib.suppress(ValueError):  # Added when running __main__.py directly
-            candidate_locations.remove(Path(__file__).parent)
-
-        # TODO(coletdjnz): remove when plugin globals system is implemented
-        if Config._plugin_dirs:
-            candidate_locations.extend(_get_package_paths(
-                *Config._plugin_dirs,
-                containing_folder=''))
+        candidate_locations = itertools.chain.from_iterable(
+            default_plugin_paths() if candidate == 'default' else candidate_plugin_paths(candidate)
+            for candidate in plugin_dirs.value
+        )

         parts = Path(*fullname.split('.'))
         for path in orderedSet(candidate_locations, lazy=True):
@ -109,7 +152,8 @@ def find_spec(self, fullname, path=None, target=None):

         search_locations = list(map(str, self.search_locations(fullname)))
         if not search_locations:
-            return None
+            # Prevent using built-in meta finders for searching plugins.
+            raise ModuleNotFoundError(fullname)

         spec = importlib.machinery.ModuleSpec(fullname, PluginLoader(), is_package=True)
         spec.submodule_search_locations = search_locations
@ -123,8 +167,10 @@ def invalidate_caches(self):


 def directories():
-    spec = importlib.util.find_spec(PACKAGE_NAME)
-    return spec.submodule_search_locations if spec else []
+    with contextlib.suppress(ModuleNotFoundError):
+        if spec := importlib.util.find_spec(PACKAGE_NAME):
+            return list(spec.submodule_search_locations)
+    return []


 def iter_modules(subpackage):
@ -134,19 +180,23 @@ def iter_modules(subpackage):
         yield from pkgutil.iter_modules(path=pkg.__path__, prefix=f'{fullname}.')


-def load_module(module, module_name, suffix):
+def get_regular_classes(module, module_name, suffix):
+    # Find standard public plugin classes (not overrides)
     return inspect.getmembers(module, lambda obj: (
         inspect.isclass(obj)
         and obj.__name__.endswith(suffix)
         and obj.__module__.startswith(module_name)
         and not obj.__name__.startswith('_')
-        and obj.__name__ in getattr(module, '__all__', [obj.__name__])))
+        and obj.__name__ in getattr(module, '__all__', [obj.__name__])
+        and getattr(obj, 'PLUGIN_NAME', None) is None
+    ))


-def load_plugins(name, suffix):
-    classes = {}
-    if os.environ.get('YTDLP_NO_PLUGINS'):
-        return classes
+def load_plugins(plugin_spec: PluginSpec):
+    name, suffix = plugin_spec.module_name, plugin_spec.suffix
+    regular_classes = {}
+    if os.environ.get('YTDLP_NO_PLUGINS') or not plugin_dirs.value:
+        return regular_classes

     for finder, module_name, _ in iter_modules(name):
         if any(x.startswith('_') for x in module_name.split('.')):
@ -163,24 +213,42 @@ def load_plugins(name, suffix):
             sys.modules[module_name] = module
             spec.loader.exec_module(module)
         except Exception:
-            write_string(f'Error while importing module {module_name!r}\n{traceback.format_exc(limit=-1)}')
+            write_string(
+                f'Error while importing module {module_name!r}\n{traceback.format_exc(limit=-1)}',
+            )
             continue
-        classes.update(load_module(module, module_name, suffix))
+        regular_classes.update(get_regular_classes(module, module_name, suffix))

     # Compat: old plugin system using __init__.py
     # Note: plugins imported this way do not show up in directories()
     # nor are considered part of the yt_dlp_plugins namespace package
+    if 'default' in plugin_dirs.value:
         with contextlib.suppress(FileNotFoundError):
             spec = importlib.util.spec_from_file_location(
-                name, Path(get_executable_path(), COMPAT_PACKAGE_NAME, name, '__init__.py'))
+                name,
+                Path(get_executable_path(), COMPAT_PACKAGE_NAME, name, '__init__.py'),
+            )
             plugins = importlib.util.module_from_spec(spec)
             sys.modules[spec.name] = plugins
             spec.loader.exec_module(plugins)
-            classes.update(load_module(plugins, spec.name, suffix))
+            regular_classes.update(get_regular_classes(plugins, spec.name, suffix))

-    return classes
+    # Add the classes into the global plugin lookup for that type
+    plugin_spec.plugin_destination.value = regular_classes
+    # We want to prepend to the main lookup for that type
+    plugin_spec.destination.value = merge_dicts(regular_classes, plugin_spec.destination.value)
+
+    return regular_classes


-sys.meta_path.insert(0, PluginFinder(f'{PACKAGE_NAME}.extractor', f'{PACKAGE_NAME}.postprocessor'))
+def load_all_plugins():
+    for plugin_spec in plugin_specs.value.values():
+        load_plugins(plugin_spec)
+    all_plugins_loaded.value = True


-__all__ = ['COMPAT_PACKAGE_NAME', 'PACKAGE_NAME', 'directories', 'load_plugins']
+def register_plugin_spec(plugin_spec: PluginSpec):
+    # If the plugin spec for a module is already registered, it will not be added again
+    if plugin_spec.module_name not in plugin_specs.value:
+        plugin_specs.value[plugin_spec.module_name] = plugin_spec
+        sys.meta_path.insert(0, PluginFinder(f'{PACKAGE_NAME}.{plugin_spec.module_name}'))
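Putting the new entry points together, a sketch of the intended embedding flow once specs are registered (the extra directory is illustrative):

from yt_dlp.globals import all_plugins_loaded, plugin_dirs
from yt_dlp.plugins import load_all_plugins

plugin_dirs.value = ['default', '/opt/ytdlp-plugins']  # 'default' keeps the standard search paths
load_all_plugins()  # runs load_plugins() for every registered PluginSpec
assert all_plugins_loaded.value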
Some files were not shown because too many files have changed in this diff.