Python: Getting files, downloading them, and moving them?
I have a list of links that I have gotten from an XML document with Python. I want to get the latest link, get the file that it links to, and move that file to a directory on my computer. How can I do this?
1 Answer
- eli porter, Lv 5, 1 decade ago, Favorite Answer
You want to look at the urllib module. As for getting the latest link, that depends on how your XML is formatted. Assume the XML is something like this
(we'll call it myxml.xml):
<?xml version="1.0" ?>
<links>
<link>http://newest.com/file.txt</link>
<link>http://older.com/file.txt</link>
</links>
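from xml.dom import minidom
import urllib.request  # Python 3; in Python 2 the call was urllib.urlretrieve
xmldoc = minidom.parse("myxml.xml")
linknode = xmldoc.childNodes[0]  # the root <links> element
urldata = linknode.childNodes[1]  # first <link>; childNodes[0] is the '\n' text node
# urlretrieve needs a full file path as its target, not just a directory
urllib.request.urlretrieve(urldata.firstChild.data, "C:\\whereyouwant\\file.txt")
# >_< if Yahoo cuts the line above off: it's .data, after firstChild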
OK, so linknode is the first child of xmldoc, which is links.
urldata is the second child of links, which is a link element. (Why the second and not the first? Because the Python XML parser keeps the whitespace between tags, so it lists the first child as '\n', the newline :/ )
urldata.firstChild.data gives the actual string between the opening and closing link tags.
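One way to sidestep the newline text nodes, as a rough sketch: ask minidom for the link elements by tag name instead of indexing childNodes. This assumes the same layout as myxml.xml, with the newest link listed first; the helper name first_link_url is made up for illustration.

```python
from xml.dom import minidom

# Same layout as the myxml.xml example (assumed: newest link comes first).
SAMPLE_XML = """<?xml version="1.0" ?>
<links>
<link>http://newest.com/file.txt</link>
<link>http://older.com/file.txt</link>
</links>"""

def first_link_url(xml_text):
    """Return the text of the first <link> element (hypothetical helper)."""
    xmldoc = minidom.parseString(xml_text)
    # getElementsByTagName returns only element nodes, so the newline
    # text nodes between the tags never show up in this list.
    links = xmldoc.getElementsByTagName("link")
    return links[0].firstChild.data.strip()

print(first_link_url(SAMPLE_XML))  # prints http://newest.com/file.txt
```

If the newest link is not first in your document, you would need to pick it by whatever your XML actually uses to mark age (a date attribute, position, and so on).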
EDIT: Yahoo can be weird with the XML I put in. You know how XML works: open tag, close tag. Make sure to format it properly, with the URL between the two tags,
i.e. [link]url[/link] and NOT [link][/link]url[/link]
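For the "move it to a directory" part of the question, here is a minimal sketch: download to a temporary location with urllib.request.urlretrieve, then move the result with shutil.move. The function name fetch_and_move and its arguments are hypothetical, and it assumes Python 3 and that the last part of the URL path is a usable file name.

```python
import os
import shutil
import tempfile
import urllib.request
from urllib.parse import urlparse

def fetch_and_move(url, dest_dir):
    """Download url to a temp file, then move it into dest_dir (hypothetical helper)."""
    # Assumption: the last path segment of the URL is a usable file name.
    filename = os.path.basename(urlparse(url).path) or "download"
    os.makedirs(dest_dir, exist_ok=True)
    with tempfile.TemporaryDirectory() as tmp_dir:
        tmp_path = os.path.join(tmp_dir, filename)
        # urlretrieve wants a file path as its second argument, not a directory.
        urllib.request.urlretrieve(url, tmp_path)
        final_path = os.path.join(dest_dir, filename)
        shutil.move(tmp_path, final_path)
    return final_path
```

You would call it with the URL string pulled out of the XML and the folder you want, e.g. fetch_and_move(url, "C:\\whereyouwant"). Downloading straight to the final path also works; the temp directory just keeps a half-finished download out of the destination folder.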