Reading HTML in Node.js

2024-04-17

Reading basic HTML in Node.js

Using the fs module

fs (file system), a module built into Node.js, provides access to the file system. You can use it to read an HTML file and get its contents as a string.

const fs = require('fs');

fs.readFile('index.html', 'utf8', (err, data) => {
  if (err) {
    console.error(err);
    return;
  }
  console.log(data);
});

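The code above reads index.html with fs.readFile. When the read finishes, the callback is called with the error information as its first argument and the file contents as its second. Because the 'utf8' encoding is specified, the contents arrive as a string rather than a Buffer. If there is no error, the contents are printed to the console.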
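For a short script, or for loading a file once at startup, the synchronous fs.readFileSync can be simpler than a callback. The following is a minimal sketch of that variant. The helper name readHtmlSync and its choice to return null for a missing file are assumptions made for illustration, not part of the original example.

```javascript
const fs = require('fs');

// Hypothetical helper: returns the file contents as a string,
// or null when the file does not exist.
function readHtmlSync(filePath) {
  try {
    return fs.readFileSync(filePath, 'utf8');
  } catch (err) {
    if (err.code === 'ENOENT') {
      return null; // the file was not found
    }
    throw err; // any other error (permissions, etc.) is passed on
  }
}

const html = readHtmlSync('index.html');
console.log(html === null ? 'index.html not found' : html);
```

readFileSync blocks the event loop while it reads, so inside a request handler the asynchronous fs.readFile is usually the better fit.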

Using a template engine

A template engine lets you generate HTML dynamically instead of serving a fixed file. EJS and Pug are two well-known template engines.

Example using EJS

const express = require('express');
const ejs = require('ejs');

const app = express();
app.set('view engine', 'ejs');

app.get('/', (req, res) => {
  const title = 'Hello World';
  const content = 'This is a sample content.';
  res.render('index', { title, content });
});

app.listen(3000, () => {
  console.log('Server listening on port 3000');
});

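The code above renders index.ejs with EJS. By default, Express looks for templates in a views directory under the directory the app is started from, so save the following content as views/index.ejs.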

<!DOCTYPE html>
<html>
<head>
  <title><%= title %></title>
</head>
<body>
  <h1><%= title %></h1>
  <p><%= content %></p>
</body>
</html>

With a template engine, you can generate HTML dynamically from variables and data.

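Notes

The above are only a few examples. There are many other ways to read HTML in Node.js.
When you need to work with more complex HTML, a DOM parser is useful.
As a security measure, HTML submitted by users must be escaped before it is included in a page. In EJS, <%= ... %> escapes the value, while <%- ... %> outputs it unescaped.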

Explanation

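The fs example reads index.html with the fs module.
fs.readFile reads a file asynchronously.
Its first argument is the path of the file to read, the second is the encoding, and the third is the callback function.
The callback receives the error information and the file contents as its arguments.
If there is no error, the file contents are printed to the console.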

The EJS example renders index.ejs with EJS.
It uses the Express framework.
The line app.set('view engine', 'ejs') sets EJS as the template engine.
The line app.get('/', ...) handles GET requests to the / path.
The line res.render('index', { title, content }) renders index.ejs and passes the variables title and content to the template.
index.ejs is HTML written with EJS template syntax.
The <%= ... %> tag outputs the value of a template variable with HTML special characters escaped.
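To make the placeholder substitution described above concrete, here is a toy sketch of what a template tag does. It is only an illustration and is not how EJS is implemented: the functions escapeHtml and renderTemplate are hypothetical, and they support nothing beyond plain variable names inside <%= ... %> (escaped) and <%- ... %> (raw).

```javascript
// Toy illustration only: this is NOT the EJS implementation.
const HTML_ESCAPES = { '&': '&amp;', '<': '&lt;', '>': '&gt;', '"': '&quot;', "'": '&#39;' };

// Replaces HTML special characters so the value is shown as text, not markup.
function escapeHtml(value) {
  return String(value).replace(/[&<>"']/g, (ch) => HTML_ESCAPES[ch]);
}

// Replaces <%= name %> with the escaped value and <%- name %> with the raw value.
// Unlike a real engine, a missing variable simply becomes an empty string here.
function renderTemplate(template, data) {
  return template.replace(/<%([=-])\s*(\w+)\s*%>/g, (match, mode, name) => {
    const value = data[name] === undefined ? '' : data[name];
    return mode === '=' ? escapeHtml(value) : String(value);
  });
}

const page = renderTemplate('<h1><%= title %></h1><p><%- content %></p>', {
  title: 'Tom & Jerry <script>',
  content: '<em>raw markup</em>',
});
console.log(page);
// <h1>Tom &amp; Jerry &lt;script&gt;</h1><p><em>raw markup</em></p>
```

The escaped form is the safe default for anything that originates from user input. The raw form is only appropriate for markup you generated yourself.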

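How to run

To run the EJS example above, follow these steps.

Install Node.js.
Save the server code to a file such as index.js, and save the template as views/index.ejs.
Install the dependencies by running npm install express ejs in the same directory.
In a terminal, run the following command.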

node index.js

This starts the server. When you open http://localhost:3000/ in a browser, the page rendered from index.ejs is displayed.

Other ways to read HTML in Node.js

Serving an HTML file over HTTP

If you use Node.js as a web server, you can read an HTML file and return its contents in response to an HTTP request.

const fs = require('fs'); // needed for fs.readFile below
const http = require('http');

const server = http.createServer((req, res) => {
  if (req.url === '/') {
    fs.readFile('index.html', 'utf8', (err, data) => {
      if (err) {
        console.error(err);
        res.statusCode = 500;
        res.end('Internal Server Error');
        return;
      }
      // Declare the charset so browsers decode non-ASCII text correctly.
      res.setHeader('Content-Type', 'text/html; charset=utf-8');
      res.end(data);
    });
  } else {
    res.statusCode = 404;
    res.end('Not Found');
  }
});

server.listen(3000, () => {
  console.log('Server listening on port 3000');
});

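The code above creates an HTTP server that responds to requests for the / path. Note that it does not check the HTTP method. It reads index.html with fs.readFile and returns the contents as the response. Requests for any other path receive a 404 response.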
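The server above reads index.html from disk on every request. If the file does not change while the server is running, one alternative is to read it once at startup and serve it from memory. The following is a minimal sketch of that idea. The names createHandler and startServer are hypothetical, and the sketch also checks the request method, which the example above does not.

```javascript
const fs = require('fs');
const http = require('http');

// Hypothetical helper: builds a request handler that serves a preloaded HTML string.
function createHandler(html) {
  return (req, res) => {
    if (req.method === 'GET' && req.url === '/') {
      res.statusCode = 200;
      res.setHeader('Content-Type', 'text/html; charset=utf-8');
      res.end(html);
    } else {
      res.statusCode = 404;
      res.setHeader('Content-Type', 'text/plain; charset=utf-8');
      res.end('Not Found');
    }
  };
}

// Hypothetical helper: reads the file once, then starts the server on the given port.
function startServer(filePath, port) {
  const html = fs.readFileSync(filePath, 'utf8'); // read once, at startup
  return http.createServer(createHandler(html)).listen(port, () => {
    console.log(`Server listening on port ${port}`);
  });
}

// Usage (not run here): startServer('index.html', 3000);
```

The trade-off is that edits to index.html are not picked up until the server restarts.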

Using a DOM parser

When you need to work with complex HTML structure, a DOM parser is useful. A DOM parser parses HTML into a tree structure and lets you access individual elements.

const fs = require('fs');
const DOMParser = require('dom-parser');

fs.readFile('index.html', 'utf8', (err, data) => {
  if (err) {
    console.error(err);
    return;
  }

  const parser = new DOMParser();
  const doc = parser.parseFromString(data, 'text/html');

  // Optional chaining avoids a TypeError when an element is missing.
  const title = doc.getElementsByTagName('title')[0]?.textContent;
  const content = doc.getElementById('content')?.textContent;

  console.log(title);
  console.log(content);
});

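The code above parses index.html with the dom-parser package. It uses the getElementsByTagName and getElementById methods to access specific HTML elements and read their text. It assumes that index.html contains a title element and an element with id="content".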

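Using web scraping libraries

For web scraping, libraries such as cheerio and puppeteer are useful. cheerio parses HTML that you have already downloaded and lets you query it with jQuery-style selectors. puppeteer drives a headless browser, so it can also handle pages whose content is rendered by JavaScript. The example below downloads the page with the request package and parses it with cheerio. Note that request has been deprecated since 2020.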

const cheerio = require('cheerio');
const request = require('request');

request('https://example.com', (err, res, body) => {
  if (err) {
    console.error(err);
    return;
  }

  const $ = cheerio.load(body);
  const title = $('title').text();
  const content = $('#content').text();

  console.log(title);
  console.log(content);
});
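Because the request package is deprecated, the download step can instead be done with the fetch function that is available as a global in Node.js 18 and later. The following is a minimal sketch under that assumption. The helper name fetchHtml and its fetchImpl parameter (which lets a different fetch implementation be substituted) are hypothetical.

```javascript
// Hypothetical helper: downloads a page and returns its HTML as a string.
// fetchImpl defaults to the global fetch available in Node.js 18 and later.
async function fetchHtml(url, fetchImpl = fetch) {
  const response = await fetchImpl(url);
  if (!response.ok) {
    // fetch does not reject on HTTP error statuses, so check explicitly.
    throw new Error(`Request failed with status ${response.status}`);
  }
  return response.text();
}

// Usage (not run here; needs network access and the cheerio package):
// const cheerio = require('cheerio');
// fetchHtml('https://example.com')
//   .then((body) => {
//     const $ = cheerio.load(body);
//     console.log($('title').text());
//   })
//   .catch((err) => console.error(err));
```

The returned string can be passed to cheerio.load in the same way as body in the example above.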