.Net6在Docker环境下操作Selenium.Chrome的那些坑

.Net6中想实现对某个网址截屏，可通过Selenium模拟访问网址并实现截图。

实现

安装Nuget包

<PackageReference Include="Selenium.Chrome.WebDriver" Version="85.0.0" />
<PackageReference Include="Selenium.Support" Version="4.1.0" />
<PackageReference Include="Selenium.WebDriver" Version="4.1.0" />

之后可通过代码实现模拟访问网址并截图

public static string PageScreenshot(string url, string uploadbasepath)
{
    ChromeDriver driver = null;
    try
    {
        ChromeOptions options = new ChromeOptions();

        options.AddArguments("headless", "disable-gpu", "no-sandbox");
        driver = new ChromeDriver(Path.GetDirectoryName(Assembly.GetExecutingAssembly().Location), options);

        //driver = new ChromeDriver("/usr/bin/google-chrome-stable", options);
        driver.Navigate().GoToUrl(url);
        string width = driver.ExecuteScript("return document.body.scrollWidth").ToString();
        string height = driver.ExecuteScript("return document.body.scrollHeight").ToString();
        driver.Manage().Window.Size = new System.Drawing.Size(int.Parse(width), int.Parse(height)); //=int.Parse( height);
        var screenshot = (driver as ITakesScreenshot).GetScreenshot();

        //directory create
        var basepath = uploadbasepath + DateTime.Now.ToString("yyyyMMdd") + "/";
        if (!Directory.Exists(uploadbasepath))
        {
            Directory.CreateDirectory(uploadbasepath);
        }
        if (!Directory.Exists(basepath))
        {
            Directory.CreateDirectory(basepath);
        }

        var path = basepath + Guid.NewGuid().ToString("N") + ".jpg";

        screenshot.SaveAsFile(path);
        return path;
    }
    catch (Exception ex)
    {
        throw;
    }
    finally
    {
        if (driver != null)
        {
            driver.Close();
            driver.Quit();
        }
    }
}

需要另外做的一步是把chromedriver从bin/Release/netcoreapp3.1/chromedriver复制到publish目录。

你以为到这就完了？这个代码确实可以在windows/linux非容器环境下运行。但是在docker里还是有些不一样。

Docker中运行的那些坑

首先需要注意.netcore3.1在Docker中操作图片记得安装libgdiplus.so

#Dockerfile
RUN apt-get update -y && apt-get install -y --allow-unauthenticated libgdiplus && apt-get clean && ln -s /usr/lib/libgdiplus.so /usr/lib/gdiplus.dll

1.第一个坑

首先遇到的就是OpenQA.Selenium.DriverServiceNotFoundException异常，异常信息是

OpenQA.Selenium.DriverServiceNotFoundException: The file /opt/google/chrome/chrome/chromedriver does not exist. The driver can be downloaded at http://chromedriver.storage.googleapis.com/index.html

这个异常明显是找不到chromedriver，那就与在非Docker环境linux中直接运行的方式一样，尝试把chromedriver复制到Docker的publish目录中，在Dockerfile中添加以下内容

#dockerfile

RUN cp /src/xxx/Release/netcoreapp3.1/chromedriver /app/publish/

2.第二个坑

尝试运行以上容器，还是失败，进入容器内部，直接运行chromedriver，可以看到缺少libxx.so之类的库。那咋办，只能尝试在镜像中安装chrome，这样相关库就有了

安装chrome相关资料

https://stackoverflow.com/questions/55206172/how-to-run-dotnet-core-app-with-selenium-in-docker

https://github.com/devpabloassis/seleniumdotnetcore/blob/master/Dockerfile

那在Dockerfile中添加安装chrome的命令

#Dockerfile Install Chrome
RUN apt-get update && apt-get install -y \
 apt-transport-https \
 ca-certificates \
 curl \
 gnupg \
 hicolor-icon-theme \
 libcanberra-gtk* \
 libgl1-mesa-dri \
 libgl1-mesa-glx \
 libpango1.0-0 \
 libpulse0 \
 libv4l-0 \
 fonts-symbola \
 --no-install-recommends \
 && curl -sSL https://dl.google.com/linux/linux_signing_key.pub | apt-key add - \
 && echo "deb [arch=amd64] https://dl.google.com/linux/chrome/deb/ stable main" > /etc/apt/sources.list.d/google.list \
 && apt-get update && apt-get install -y \
 google-chrome-stable \
 --no-install-recommends \
 && apt-get purge --auto-remove -y curl \
 && rm -rf /var/lib/apt/lists/*

3.第三个坑

运行以上修改后的容器，又一个异常

DevToolsActivePort file doesn't exist

继续查资料发现需要加个参数disable-dev-shm-usage

https://stackoverflow.com/questions/50642308/webdriverexception-unknown-error-devtoolsactiveport-file-doesnt-exist-while-t

但是前面试了不在docker内运行，需要这个参数，那就加个环境变量区分开docker与非docker环境

#Dockerfile

ENV INDOCKER 1

public static string PageScreenshot(string url, string uploadbasepath)
{
    ChromeDriver driver = null;
    try
    {
        var indocker = Environment.GetEnvironmentVariable("INDOCKER");
        ChromeOptions options = new ChromeOptions();

        if (indocker == "1") 
        {
            options.AddArguments("headless", "disable-gpu", "no-sandbox", "disable-dev-shm-usage");
            //driver = new ChromeDriver("/opt/google/chrome/chrome", options);
        }
        else
        {
            options.AddArguments("headless", "disable-gpu", "no-sandbox");
        }
        driver = new ChromeDriver(Path.GetDirectoryName(Assembly.GetExecutingAssembly().Location), options);

        //driver = new ChromeDriver("/usr/bin/google-chrome-stable", options);
        driver.Navigate().GoToUrl(url);
        string width = driver.ExecuteScript("return document.body.scrollWidth").ToString();
        string height = driver.ExecuteScript("return document.body.scrollHeight").ToString();
        driver.Manage().Window.Size = new System.Drawing.Size(int.Parse(width), int.Parse(height)); //=int.Parse( height);
        var screenshot = (driver as ITakesScreenshot).GetScreenshot();

        //directory create
        var basepath = uploadbasepath + DateTime.Now.ToString("yyyyMMdd") + "/";
        if (!Directory.Exists(uploadbasepath))
        {
            Directory.CreateDirectory(uploadbasepath);
        }
        if (!Directory.Exists(basepath))
        {
            Directory.CreateDirectory(basepath);
        }

        var path = basepath + Guid.NewGuid().ToString("N") + ".jpg";

        screenshot.SaveAsFile(path);
        return path;
    }
    catch (Exception ex)
    {
        throw;
    }
    finally
    {
        if (driver != null)
        {
            driver.Close();
            driver.Quit();
        }
    }
}

4.第四个坑

尝试运行上面修改后的容器，又一个异常

This version of ChromeDriver only supports Chrome version 99
Current browser version is 109.0.5414.74 with binary path /usr/bin/google-chrome

这个信息字面意思就是之前第一个坑复制的chromedriver版本较低。那就直接去官网下载最新的chromedriver，并放到镜像内

下载地址：http://chromedriver.storage.googleapis.com/index.html

# Dockerfile
COPY ["xxx/chromedriver", "."]
RUN chmod +x chromedriver

5.第五个坑

继续尝试运行，发现这次能成功截图了，等等...这字体咋还是乱码呢

.Net6在Docker环境下操作Selenium.Chrome的那些坑

明显是中文乱码了，应该是容器内没中文字体，那就安装中文字体，字体可以从C:\Windows\Fonts中获取ttc,ttf字体文件

#Dockerfile

RUN apt-get update
RUN apt-get install -y --no-install-recommends libgdiplus libc6-dev 
RUN apt-get install -y fontconfig xfonts-utils
COPY fonts/  /usr/share/fonts/
RUN mkfontscale
RUN mkfontdir
RUN fc-cache -fv

再次运行，终于成功

.Net6在Docker环境下操作Selenium.Chrome的那些坑