There are many ways to download a website's HTML in C# and export it as a PDF or in any other format we want. However, most of these approaches cannot execute the page's JavaScript, so they do not capture the fully rendered page.
Option 1:
Simply use WebClient to download the website as HTML, then convert it to PDF. This fetches only the raw HTML returned by the server; no JavaScript is executed.
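using System.IO;
using System.Net;

using (WebClient client = new WebClient())
{
    // Download the raw response body of the page
    byte[] websiteData = client.DownloadData("somewebsiteurl");
    // Save the HTML to disk
    File.WriteAllBytes("savepath", websiteData);
    // do further steps to convert html to pdf
}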
Option 2:
Use a headless browser with Puppeteer to export a PDF from a URL. Here you can add wait handlers so that the page's JavaScript finishes rendering before the export.
Ref:
https://developer.chrome.com/docs/puppeteer/
https://www.puppeteersharp.com/
Using JS:
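const puppeteer = require('puppeteer');
(async () => {
  // Launch a headless browser and open a new tab
  const browser = await puppeteer.launch();
  const page = await browser.newPage();
  await page.goto('https://example.com');
  // Captures a PNG; use page.pdf({ path: 'example.pdf' }) to write a PDF instead
  await page.screenshot({ path: 'example.png' });
  await browser.close();
})();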
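As a rough sketch of the wait handlers mentioned above, the snippet below waits for network activity to settle and, optionally, for a selector to appear before writing the PDF. The helper names pdfOptionsFor and exportPdf, the readySelector parameter, and the A4 and printBackground settings are assumptions chosen for illustration, not part of the original example.

```javascript
// Hypothetical helper: builds the options object passed to page.pdf().
function pdfOptionsFor(outputPath) {
  return {
    path: outputPath,      // where the PDF file is written
    format: 'A4',          // assumed paper size
    printBackground: true, // include CSS backgrounds in the PDF
  };
}

// Hypothetical helper: opens the URL, waits for rendering, then exports a PDF.
async function exportPdf(url, outputPath, readySelector) {
  // Loaded here so the helper above can be used without Puppeteer installed
  const puppeteer = require('puppeteer');
  const browser = await puppeteer.launch();
  try {
    const page = await browser.newPage();
    // Wait until there have been no network connections for at least 500 ms
    await page.goto(url, { waitUntil: 'networkidle0' });
    if (readySelector) {
      // Optionally wait for an element that the page's JavaScript creates
      await page.waitForSelector(readySelector);
    }
    await page.pdf(pdfOptionsFor(outputPath));
  } finally {
    // Close the browser even if navigation or export fails
    await browser.close();
  }
}

// Example call (not executed here):
// exportPdf('https://example.com', 'example.pdf').catch(console.error);
```

Waiting for a selector is usually more reliable than a fixed delay, since it is tied to the content actually appearing on the page.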
Using C#:
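using PuppeteerSharp;

var outputFile = "example.png"; // placeholder output path

// Download a compatible browser build if one is not already cached
using var browserFetcher = new BrowserFetcher();
await browserFetcher.DownloadAsync();
await using var browser = await Puppeteer.LaunchAsync(
    new LaunchOptions { Headless = true });
await using var page = await browser.NewPageAsync();
await page.GoToAsync("http://www.google.com");
// Captures an image; use page.PdfAsync("example.pdf") to write a PDF instead
await page.ScreenshotAsync(outputFile);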
Note: instead of downloading a browser, we can also use an already installed Chromium-based browser to export the website, as below
var browser = await Puppeteer.LaunchAsync(new LaunchOptions
{
    Headless = true,
    // --disable-web-security turns off the same-origin policy; use it only for trusted pages
    Args = new[] { "--disable-features=site-per-process", "--disable-web-security" },
    ExecutablePath = "any chromium browser exe path"
});