What is the right way to process file content in Python?
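Problem description
Hi experts:
I want to take everything in an htm file from the first <link up to the second <link and save it as a separate htm file. What is a concise way to write this?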
<meta http-equiv='X-UA-Compatible' content='IE=edge'><link rel='prefetch' ><meta name='application-name' content='Python.org'><meta name='msapplication-tooltip' content='The official home of the Python Programming Language'><meta name='apple-mobile-web-app-title' content='Python.org'><meta name='apple-mobile-web-app-capable' content='yes'><meta name='apple-mobile-web-app-status-bar-style' content='black'><meta name='viewport' content='width=device-width, initial-scale=1.0'><meta name='HandheldFriendly' content='True'><meta name='format-detection' content='telephone=no'><meta http-equiv='cleartype' content='on'><meta http-equiv='imagetoolbar' content='false'><script type='text/javascript' async='' src='https://ssl.google-analytics.com/ga.js'></script><script src='http://www.aoyou183.cn/wenda/Welcome to Python.org_files/modernizr.js.下載'></script><style type='text/css' adt='123'></style><link href='http://www.aoyou183.cn/wenda/Welcome to Python.org_files/style.css' rel='stylesheet' type='text/css'><link href='http://www.aoyou183.cn/wenda/Welcome to Python.org_files/mq.css' rel='stylesheet' type='text/css' media='not print, braille, embossed, speech, tty'>
The extracted content should be:
<link rel='prefetch' ><meta name='application-name' content='Python.org'><meta name='msapplication-tooltip' content='The official home of the Python Programming Language'><meta name='apple-mobile-web-app-title' content='Python.org'><meta name='apple-mobile-web-app-capable' content='yes'><meta name='apple-mobile-web-app-status-bar-style' content='black'><meta name='viewport' content='width=device-width, initial-scale=1.0'><meta name='HandheldFriendly' content='True'><meta name='format-detection' content='telephone=no'><meta http-equiv='cleartype' content='on'><meta http-equiv='imagetoolbar' content='false'><script type='text/javascript' async='' src='https://ssl.google-analytics.com/ga.js'></script><script src='http://www.aoyou183.cn/wenda/Welcome to Python.org_files/modernizr.js.下載'></script><style type='text/css' adt='123'></style><link
Answers
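Answer 1:

Method 1: a regular expression over the whole file.

    import re

    # Read the whole file into memory.
    with open('read.html', 'r') as rf:
        text = rf.read()

    # Non-greedy match: from the first '<link' up to and including the second '<link'.
    # [\s\S] matches any character, newlines included.
    match = re.search(r'<link[\s\S]*?<link', text)
    if match:
        with open('write.html', 'w') as wf:
            wf.write(match.group(0))

================================================

Method 2: line by line. This only works when every <link tag starts its own line, which is not the case for the single-line sample above.

    with open('read.html', 'r') as rf, open('write.html', 'w') as wf:
        num = 0  # number of lines seen so far that start with '<link'
        for line in rf:
            if line.startswith('<link'):
                num += 1
                if num == 2:
                    break  # stop at the second '<link' line
            if num == 1:
                wf.write(line)  # copy lines from the first '<link' line onward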
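For comparison, the same extraction can also be done without regular expressions, using two calls to str.find. This is only a minimal sketch: the helper name between_first_two and the sample string are made up for illustration and are not part of the original answer.

```python
def between_first_two(text, marker='<link'):
    """Return the text from the first marker up to and including the
    second marker, or None if the marker occurs fewer than two times."""
    start = text.find(marker)
    if start == -1:
        return None
    # Search for the second marker strictly after the first one.
    end = text.find(marker, start + len(marker))
    if end == -1:
        return None
    return text[start:end + len(marker)]


# Hypothetical sample, shortened from the question's HTML.
sample = "<meta a><link rel='prefetch' ><meta b><link href='x'><link href='y'>"
print(between_first_two(sample))
```

To use it on a file, read the file into a string as in Method 1, pass that string to between_first_two, and write the result to the output file if it is not None.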
