Getting Started with Playwright for Python: Library

Test Automation · 2023-06-30

Installation

PIP

pip install --upgrade pip
pip install playwright
playwright install

CONDA

conda config --add channels conda-forge
conda config --add channels microsoft
conda install playwright
playwright install

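These commands download the Playwright package and install browser binaries for Chromium, Firefox, and WebKit. To modify this behavior, see the installation parameters.

Usage

Once installed, you can import Playwright in a Python script and launch any of the three browsers (chromium, firefox, and webkit).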

from playwright.sync_api import sync_playwright

with sync_playwright() as p:
    browser = p.chromium.launch()
    page = browser.new_page()
    page.goto("http://playwright.dev")
    print(page.title())
    browser.close()

Playwright supports two variations of the API: synchronous and asynchronous. If your modern project uses asyncio, you should use the async API:

import asyncio
from playwright.async_api import async_playwright

async def main():
    async with async_playwright() as p:
        browser = await p.chromium.launch()
        page = await browser.new_page()
        await page.goto("http://playwright.dev")
        print(await page.title())
        await browser.close()

asyncio.run(main())

First script

In our first script, we will navigate to https://playwright.dev/ and take a screenshot in WebKit.

from playwright.sync_api import sync_playwright

with sync_playwright() as p:
    browser = p.webkit.launch()
    page = browser.new_page()
    page.goto("https://playwright.dev/")
    page.screenshot(path="example.png")
    browser.close()

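By default, Playwright runs browsers in headless mode. To see the browser UI, pass the headless=False flag when launching the browser. You can also use slow_mo to slow down execution. Learn more in the debugging tools section.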

firefox.launch(headless=False, slow_mo=50)
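
The two launch flags above can be collected in one place so that a script switches between headless and headed runs with a single argument. The following is a minimal sketch: the helper name launch_options and its parameters debug and slow_mo_ms are hypothetical and not part of Playwright; only headless and slow_mo (a delay in milliseconds) are real launch() options.

```python
def launch_options(debug: bool = False, slow_mo_ms: float = 50) -> dict:
    """Build keyword arguments for a browser's launch() call.

    Hypothetical helper: only the keys "headless" and "slow_mo"
    are actual Playwright launch options.
    """
    if not debug:
        # Default behaviour: no visible browser window, no artificial delay.
        return {"headless": True}
    # Debug behaviour: show the browser UI and slow each operation down.
    return {"headless": False, "slow_mo": slow_mo_ms}


# Usage (requires Playwright and its browser binaries to be installed):
#
#   from playwright.sync_api import sync_playwright
#
#   with sync_playwright() as p:
#       browser = p.firefox.launch(**launch_options(debug=True))
#       ...
#       browser.close()
```

Keeping the options in a plain dictionary means the same helper works for chromium, firefox, and webkit, since all three accept these keyword arguments.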

Interactive mode (REPL)

You can launch the interactive Python REPL:

python

Then launch Playwright within it for quick experimentation:

>>> from playwright.sync_api import sync_playwright
>>> playwright = sync_playwright().start()
# Use playwright.chromium, playwright.firefox or playwright.webkit
# Pass headless=False to launch() to see the browser UI
>>> browser = playwright.chromium.launch()
>>> page = browser.new_page()
>>> page.goto("https://playwright.dev/")
>>> page.screenshot(path="example.png")
>>> browser.close()
>>> playwright.stop()

Async REPL, such as the asyncio REPL:

python -m asyncio
>>> from playwright.async_api import async_playwright
>>> playwright = await async_playwright().start()
>>> browser = await playwright.chromium.launch()
>>> page = await browser.new_page()
>>> await page.goto("https://playwright.dev/")
>>> await page.screenshot(path="example.png")
>>> await browser.close()
>>> await playwright.stop()

PyInstaller

You can use Playwright with PyInstaller to create standalone executables.

main.py

from playwright.sync_api import sync_playwright

with sync_playwright() as p:
    browser = p.chromium.launch()
    page = browser.new_page()
    page.goto("https://playwright.dev/")
    page.screenshot(path="example.png")
    browser.close()

If you want to bundle browsers with the executables:

  • Bash

    PLAYWRIGHT_BROWSERS_PATH=0 playwright install chromium
    pyinstaller -F main.py

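Note: Bundling the browsers with the executables produces larger binaries. It is recommended to bundle only the browsers you use.

Known issues

time.sleep() leads to outdated state

You most likely do not need to wait manually, since Playwright has auto-waiting. If you still rely on it, use page.wait_for_timeout(5000) instead of time.sleep(5). It is better not to wait for a timeout at all, but sometimes it is useful for debugging. In those cases, use Playwright's wait method (wait_for_timeout) rather than the time module, because Playwright relies internally on asynchronous operations, and these are not processed correctly while time.sleep(5) blocks.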
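
The effect of a blocking sleep on asynchronous work can be illustrated with the standard library alone. The sketch below is an analogy, not Playwright's actual internals: a background task stands in for the asynchronous operations Playwright depends on, and the function names (ticks_during, blocking_pause, cooperative_pause) are made up for this illustration.

```python
import asyncio
import time


async def blocking_pause() -> None:
    # Blocks the whole thread, so the event loop cannot run anything else.
    time.sleep(0.1)


async def cooperative_pause() -> None:
    # Yields to the event loop, so other tasks keep running meanwhile.
    await asyncio.sleep(0.1)


async def ticks_during(pause) -> int:
    """Count how often a background task runs while `pause` is in progress."""
    ticks = 0

    async def ticker() -> None:
        nonlocal ticks
        while True:
            ticks += 1
            await asyncio.sleep(0.01)

    task = asyncio.create_task(ticker())
    await asyncio.sleep(0)      # give the ticker a chance to start
    before = ticks
    await pause()
    during = ticks - before
    task.cancel()
    try:
        await task              # wait for the cancellation to finish
    except asyncio.CancelledError:
        pass
    return during


blocked = asyncio.run(ticks_during(blocking_pause))
cooperative = asyncio.run(ticks_during(cooperative_pause))
print(f"ticks during time.sleep: {blocked}")
print(f"ticks during asyncio.sleep: {cooperative}")
```

The background task makes no progress during the blocking pause and several steps during the cooperative one. This is the reason to prefer page.wait_for_timeout, which waits without starving pending asynchronous work.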

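Incompatible with asyncio's SelectorEventLoop on Windows

Playwright runs its driver in a subprocess, so on Windows it requires asyncio's ProactorEventLoop, because SelectorEventLoop does not support async subprocesses. On Windows with Python 3.7, Playwright sets the default event loop to ProactorEventLoop, which is already the default on Python 3.8+.

Threading

Playwright's API is not thread-safe. If you use Playwright in a multi-threaded environment, create one Playwright instance per thread. For more details, see the threading issue.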
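
If a project or one of its dependencies changes the asyncio event loop policy, it can help to check which loop type is actually in effect before starting Playwright. The following is a small diagnostic sketch using only the standard library; the function name default_loop_is_proactor is hypothetical.

```python
import asyncio
import sys


def default_loop_is_proactor() -> bool:
    """Report whether asyncio's default event loop is a ProactorEventLoop."""
    if sys.platform != "win32":
        # ProactorEventLoop exists only on Windows; other platforms use a
        # different loop and are not affected by this issue.
        return False
    loop = asyncio.new_event_loop()
    try:
        return isinstance(loop, asyncio.ProactorEventLoop)
    finally:
        loop.close()


if sys.platform == "win32" and not default_loop_is_proactor():
    print("Warning: the default event loop is not a ProactorEventLoop")
```

On Windows, a False result suggests that something has switched the policy to a selector-based loop, which is the situation described above.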
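
One way to follow this advice is to let every worker thread open and close its own Playwright instance. The sketch below assumes a thread pool from the standard library; the names fetch_title and fetch_titles, and the fetch parameter, are hypothetical. Playwright is imported inside the worker so that the module itself loads even where Playwright is not installed.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_title(url: str) -> str:
    """Open `url` in a fresh Playwright instance and return the page title."""
    from playwright.sync_api import sync_playwright

    # Each call creates its own Playwright instance, so no Playwright
    # object is ever shared between threads.
    with sync_playwright() as p:
        browser = p.chromium.launch()
        try:
            page = browser.new_page()
            page.goto(url)
            return page.title()
        finally:
            browser.close()


def fetch_titles(urls, max_workers: int = 2, fetch=fetch_title) -> list:
    """Run `fetch` for every URL in a thread pool, preserving input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(fetch, urls))


# Usage (requires Playwright and its browser binaries to be installed):
#
#   titles = fetch_titles(["https://playwright.dev/"])
```

Starting a browser per call is slow; it keeps the sketch simple, and a longer-lived worker could keep its own instance open for as long as its thread runs.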
