
I am not sure whether this is a React-specific question, a general JavaScript question, or just a basic question about how websites load their sources (which may depend on whether a browser or some other program fetches the contents of the website / web app).

My goal:

I want to exclude bots from the website / frontend. Only humans should be able to access it and use the information and functionality provided there, e.g. email addresses, phone numbers and contact forms. However, I do not want visitors to have to create accounts first.

I created a Node.js app using npx create-react-app my-app with the following relevant setup in package.json:

"dependencies": {
    "@testing-library/jest-dom": "^5.14.1",
    "@testing-library/react": "^11.2.7",
    "@testing-library/user-event": "^12.8.3",
    "react": "^17.0.2",
    "react-dom": "^17.0.2",
    "react-google-recaptcha-v3": "^1.9.5",
    "react-scripts": "4.0.0",
    "react-share": "^4.4.0",
    "react-spinners": "^0.11.0",
    "web-vitals": "^0.2.4",
    ...
  }

This is my index.js:

import React from 'react';
import ReactDOM from 'react-dom';
import './index.css';
import './i18n';
import reportWebVitals from './reportWebVitals';
import {GoogleReCaptchaProvider} from 'react-google-recaptcha-v3';
import {ValidationRecaptcha} from "./components/reCaptcha";

ReactDOM.render(
    <React.StrictMode>
        <GoogleReCaptchaProvider reCaptchaKey={process.env.REACT_APP_RECAPTCHA_PUBLIC_KEY}>
            <ValidationRecaptcha/>
        </GoogleReCaptchaProvider>
    </React.StrictMode>,
    document.getElementById('root')
);

reportWebVitals();

This is the ValidationRecaptcha component:

import React, {useCallback, useEffect, useState} from "react";
import {useGoogleReCaptcha} from "react-google-recaptcha-v3";
import App from "../App";
import {PuffLoader} from "react-spinners";

export const ValidationRecaptcha = () => {
    const { executeRecaptcha } = useGoogleReCaptcha();
    const [verified, setVerified] = useState(0);
    const handleReCaptchaVerify = useCallback(async () => {

        if (!executeRecaptcha) {
            console.log('Execute recaptcha not yet available');
            return;
        }

        console.log('Request token from Google...');
        const token = await executeRecaptcha('token');
        console.log('Received token from Google: ' + token);

        // check the token with the backend
        const url = process.env.REACT_APP_BACKEND_API_URL + "/recaptcha?token=" + encodeURIComponent(token);
        console.log('Request token verification from backend: ' + url);
        try {
            const res = await fetch(url);
            const data = await res.json();
            // accept both a boolean and the strings "true" / "false" from the backend
            const success = data.success === true || data.success === 'true';
            console.log('Answer from API: success = ' + success);
            setVerified(success ? 2 : 1);
        } catch (err) {
            // treat network or parsing errors as a failed verification
            console.error('Token verification failed: ' + err);
            setVerified(1);
        }

        // `verified` is deliberately not a dependency: listing it would recreate
        // this callback after setVerified and trigger a second verification
    }, [executeRecaptcha]);

    // use useEffect to trigger the verification as soon as the component is loaded
    useEffect(() => {
        handleReCaptchaVerify();
    }, [handleReCaptchaVerify]);

    let content;
    if (verified === 0) {
        content = <div><div className={'center'}><PuffLoader/></div><div className={'center-horizontally'}>Checking your browser with Google reCAPTCHA v3...</div></div>;
    } else if (verified === 1) {
        content = <div className={'center'}>Sorry, looks like you're a robot. This site is humans-only!</div>;
    } else {
        content = <App/>;
    }
    return content;
};

Quick explanation:

The code works as expected in browsers (I've tested Chrome, Safari and Firefox). While waiting for the response from the backend, the app renders the content assigned in the first case, i.e. verified === 0. When the response has arrived and is negative (i.e. the token is not valid or the score indicates a bot), the app renders the content assigned in the second case, i.e. verified === 1. When the response has arrived and is positive (i.e. the token is valid), the app renders the content assigned in the third case, i.e. verified === 2 (here: the else branch). This last case renders <App/> itself with all its functionality, its static content (like the imprint text with email addresses, phone numbers and addresses) and a contact form.
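
To make the three cases easier to follow, they can also be written as named states. This is only an illustrative sketch: the names VerificationState, stateFromBackend and viewFor are hypothetical and do not appear in my component, which uses the plain numbers 0, 1 and 2.

```javascript
// Named states instead of the magic numbers 0, 1 and 2 (names are hypothetical).
const VerificationState = Object.freeze({
    PENDING: 0,   // waiting for the backend answer: show the spinner
    REJECTED: 1,  // token invalid or score too low: show the "humans-only" message
    VERIFIED: 2,  // token valid: render <App/>
});

// Maps the backend's `success` flag to the state the component should store.
function stateFromBackend(success) {
    return success ? VerificationState.VERIFIED : VerificationState.REJECTED;
}

// Describes which view belongs to a given state (plain strings stand in for JSX).
function viewFor(state) {
    switch (state) {
        case VerificationState.PENDING:
            return 'spinner';
        case VerificationState.REJECTED:
            return 'robot-message';
        default:
            return 'app';
    }
}
```

In the component, setVerified(stateFromBackend(success)) would then take the place of setVerified(success ? 2 : 1).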

Now my questions:

  1. Is this even reasonable, or is a program (e.g. a spam bot that scrapes websites) able to fetch all the JSX files and so still access public personal data, like emails, phone numbers and addresses in the imprint or in any other component far down the hierarchy, i.e. children of the <App/> component? When I manually test an invalid token or score, I see the verified === 1 content, and Chrome only shows the JSX files that were already rendered in the sources, not the rest. I can also search all the sources for an email address without any matches, even though this email address appears in clear text in a child component of <App/> (I don't want answers / advice on how to hide emails from bots; this is not my question). When I test with a valid token and score, I get to see my app and, of course, all the source files.
  2. So can I assume that a React app only makes those files publicly available that are needed / rendered at runtime?
  3. In principle this equals the first question, but I will ask anyway to underline my lack of understanding: Does this already protect the contact form far down the hierarchy from spam bot attacks, or do I have to check again there before sending the data to the SMTP service (either by passing verified down via props or by using another reCAPTCHA component for the form's submit button; but again, this is not the question)?
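
To make question 3 concrete, a second check on the backend, right before the form data is handed to the SMTP service, could look roughly like the sketch below. It assumes Node.js 18+ (global fetch) and Google's siteverify endpoint; isAcceptable, verifyToken, the 'contact_form' action name and the 0.5 threshold are assumptions for illustration, not my actual backend code.

```javascript
// Google's verification endpoint for reCAPTCHA tokens.
const SITEVERIFY_URL = 'https://www.google.com/recaptcha/api/siteverify';

// Decide whether a parsed siteverify response is acceptable.
// `expectedAction` and `minScore` are assumptions; 0.5 is only a starting point.
function isAcceptable(result, expectedAction, minScore = 0.5) {
    return Boolean(result)
        && result.success === true
        && result.action === expectedAction
        && typeof result.score === 'number'
        && result.score >= minScore;
}

// Ask Google to verify a token. This has to run on the server,
// because it needs the secret key that must never reach the frontend.
async function verifyToken(token, secretKey, expectedAction) {
    const body = new URLSearchParams({ secret: secretKey, response: token });
    const res = await fetch(SITEVERIFY_URL, { method: 'POST', body });
    if (!res.ok) {
        return false;
    }
    return isAcceptable(await res.json(), expectedAction);
}
```

A form handler would then call something like await verifyToken(token, secretKey, 'contact_form') and only forward the message when it resolves to true.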

I think these questions result from a lack of knowledge about React and how it works, so if someone knows good videos / tutorials / introductions that could help me here, I would be thankful too!

Best, maximotus

  • You created a React app, using Node.js to run webpack and other packages... not a Node.js app. 1. Some bots won't see your code because they only look at the HTML output (Ctrl+U in the browser). 2. Some bots might take your script and get data from it. Everything you physically place in your JS files is available in the code, so a smart bot would be able to get all the data. You could make it harder for a bot, but it would still be able to get it. 3. If you are fetching data from an API (phone numbers), then techniques like captchas or click-to-show-number MIGHT make it more difficult. – Bartek B. Oct 22 '21 at 19:07
  • @BartekB. Thanks for your help! Oh yes, it's not a Node.js app for sure! So I can summarize as follows: 1. My code is not pointless, since it helps against "stupid" HTML-only bots. 2. However, sensitive data should not be placed physically in a React-related JS file, since "advanced" bots can get everything that's placed into the frontend code, right? 3. Even captchas are not able to prevent bots from fetching data in general, so the only real purpose of techniques like that is to make it harder (but not impossible)? – maximotus Oct 25 '21 at 13:22
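
Point 2 of the first comment can be illustrated with a few lines of JavaScript. The sketch assumes a bot has already downloaded the compiled bundle as plain text; bundleText is a made-up excerpt, info@example.com is a placeholder address, and extractEmails is a hypothetical helper, not a real scraper.

```javascript
// Made-up excerpt of a compiled bundle; real bundles are minified but still plain text.
const bundleText =
    'function Imprint(){return e.createElement("p",null,"Mail: info@example.com")}';

// Deliberately simple e-mail pattern, good enough for illustration only.
const EMAIL_PATTERN = /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g;

// Returns every distinct e-mail-like string found in the given text.
function extractEmails(text) {
    return [...new Set(text.match(EMAIL_PATTERN) ?? [])];
}

console.log(extractEmails(bundleText));
```

The string is found without rendering any component, which is why a check that only decides what gets rendered does not by itself keep such text out of reach.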
