Search and Download YouTube Videos
Search and download YouTube videos using Python
For example, searching for “Top English KTV” scans the search results for song playlists and collects the web link of each individual song in every playlist, so that the songs can be downloaded locally. Users can choose to download in either video or audio format.
The script uses the Python Pattern module for URL requests and DOM object processing. The actual downloading is handled by Pafy, a comprehensive Python module that supports downloads in both video and audio formats. Pafy has other features that this script does not use.
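As a rough illustration of how the video/audio choice can map onto Pafy, here is a minimal sketch. It assumes Pafy's `pafy.new()`, `getbest()`, `getbestaudio()` and `Stream.download()` calls; the helper names `pick_stream` and `download_one` are hypothetical and are not taken from the script itself.

```python
def pick_stream(video, as_audio):
    """Pick the stream to download from a Pafy video object.

    as_audio=True  -> best audio-only stream
    as_audio=False -> best regular (video + audio) stream
    """
    return video.getbestaudio() if as_audio else video.getbest()


def download_one(url, as_audio=True, out_dir="."):
    """Download one YouTube URL with Pafy (hypothetical helper, needs network)."""
    import pafy  # third-party; imported lazily so pick_stream works without it

    video = pafy.new(url)
    stream = pick_stream(video, as_audio)
    # filepath may be a directory; Pafy then chooses the file name itself.
    stream.download(filepath=out_dir)
```

Keeping the stream choice in its own small function means the audio/video switch can be checked without touching the network.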
The main flow of the script is as follows.
- Form the YouTube search URL with the prefix “https://www.youtube.com/results?search_query=” and the search keyword
- Using the above URL, scrape all the URLs that link to a playlist. The XPath for the playlist element can easily be obtained with any web browser's developer tools by inspecting the element and copying its XPath. The playlist URLs can be obtained using the Pattern DOM object: ‘dom_object(div ul li a[class="yt-uix-sessionlink"])’.
- Filter the list of extracted links to keep only the URLs that start with “/playlist?”, which is how playlist links appear in the search results.
- For each playlist in the list, scrape the playlist web page to retrieve the URL of each individual video. The playlist elements can be retrieved using the Pattern DOM object: ‘dom_object(div ul li a[class="yt-uix-sessionlink"])’.
- Download each individual video or audio track to the local computer by passing the video URL to Pafy.
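The first three steps can be sketched roughly as follows. This is not the script's own code: it swaps the Pattern DOM call for the standard library's `html.parser` so that it runs without extra dependencies, it matches the class name without checking the `div ul li` ancestors, and the names `build_search_url`, `SessionLinkParser` and `playlist_urls` are made up for illustration. It also assumes the page markup described above, which may differ from what YouTube serves today.

```python
from html.parser import HTMLParser
from urllib.parse import quote_plus, urljoin

YOUTUBE = "https://www.youtube.com"
SEARCH_PREFIX = YOUTUBE + "/results?search_query="


def build_search_url(keyword):
    """Step 1: append the URL-encoded keyword to the search prefix."""
    return SEARCH_PREFIX + quote_plus(keyword)


class SessionLinkParser(HTMLParser):
    """Step 2: collect the href of every <a> carrying the target class."""

    def __init__(self, target_class="yt-uix-sessionlink"):
        super().__init__()
        self.target_class = target_class
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr = dict(attrs)
        classes = (attr.get("class") or "").split()
        if self.target_class in classes and attr.get("href"):
            self.hrefs.append(attr["href"])


def playlist_urls(html, limit=None):
    """Step 3: keep links starting with "/playlist?" and make them absolute."""
    parser = SessionLinkParser()
    parser.feed(html)
    seen, result = set(), []
    for href in parser.hrefs:
        if href.startswith("/playlist?") and href not in seen:
            seen.add(href)
            result.append(urljoin(YOUTUBE, href))
    return result[:limit]  # limit=None keeps every playlist
```

The `limit` argument plays a role similar to capping the number of playlists to extract; the HTML passed in would come from downloading the URL returned by `build_search_url`.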
The script can be used as follows:

    from youtube_search_and_download import YouTubeHandler

    search_key = 'chinese top ktv' #keywords
    yy = YouTubeHandler(search_key)
    yy.download_as_audio =1 # 1- download as audio format, 0 - download as video
    yy.set_num_playlist_to_extract(5) # number of playlist to download

    print 'Get all the playlist'
    yy.get_playlist_url_list()
    print yy.playlist_url_list

    ## Get all the individual video and title from each of the playlist
    yy.get_video_link_fr_all_playlist()
    for key in yy.video_link_title_dict.keys():
        print key, ' ', yy.video_link_title_dict[key]
        print
    print

    print 'download video'
    yy.download_all_videos(dl_limit =200) #number of videos to download.
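The usage above collects video links from every playlist and caps the number of downloads with `dl_limit`. How the script does this internally is not shown, so the following is only a sketch under assumptions: it supposes that links scraped from a playlist page are relative `/watch?v=...` URLs, and the helpers `video_id_from_href` and `watch_urls` are hypothetical names.

```python
from itertools import islice
from urllib.parse import parse_qs, urlparse


def video_id_from_href(href):
    """Return the `v` query parameter of a /watch link, or None otherwise."""
    parsed = urlparse(href)
    if parsed.path != "/watch":
        return None
    values = parse_qs(parsed.query).get("v")
    return values[0] if values else None


def watch_urls(hrefs, dl_limit=None):
    """Build absolute watch URLs, dropping duplicates and honouring dl_limit."""
    unique = {}  # video id -> URL; dicts keep insertion order
    for href in hrefs:
        vid = video_id_from_href(href)
        if vid and vid not in unique:
            unique[vid] = "https://www.youtube.com/watch?v=" + vid
    return list(islice(unique.values(), dl_limit))  # dl_limit=None keeps all
```

Deduplicating by video id avoids downloading the same song twice when it appears in several playlists, and each resulting URL can be passed to Pafy as described in the last step of the flow.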