I'm trying to rebuild a website that I scraped from the web using wget. It appears to be a Next.js application, as I see the _next folder. I have no experience with Next.js and no idea how it works internally, but it seems to minify all the code into a single script.
Is there any way to "undo" this and make it look like pretty HTML?
Here is what it looks like
Unfortunately this isn't possible: the HTML you scraped is static HTML pre-rendered by Next.js, which is later hydrated by the JavaScript that resides in the chunks folder.
To recreate the website, you'd first have to recreate all the JavaScript that was compiled into chunks by webpack or SWC, which is no easy task. It's laborious and can only be minimally automated, and there's no way to "demangle" transpiled code back into its original form.
There might be a better solution to your question if you provide more information about your target and motivation behind doing so. Otherwise, I'd strongly recommend not spending time trying to reverse transpiled code.
Related
When developing javascript code, what are the best practices for maintaining the code in repositories?
For example, suppose I develop a set of useful functions and put them in a script called "sugar.js". In the code repository I put them in c:\codebase\sugar.js.
Now I want to use the script in a web site being developed, and I locate it at c:\mywebsite\sugar.js (ready for uploading to a server).
Do I keep a copy of sugar.js? What if I fix sugar.js in one location - it won't be synchronized with the other?
What if I build a second web site that also uses sugar.js? Do I take another copy located at, say, c:\mywebsite2\sugar.js?
If you are using something like Visual Studio, you can use NuGet to version many of the popular JavaScript frameworks on a per-project basis.
If you are writing in something else, you could try package managers such as npm or JSPkg (http://jspkg.com/).
If it is your own library, I would recommend setting up source control and publishing versioned releases as branches or tags, so you can keep track of everything. Git and GitHub support this, and GitHub can offer each version as a zipped download.
I would also keep each project's JavaScript files separate, so that a change won't immediately break every site, just the one you recently updated. This advice could go out the window if you are running hundreds of sites and really just need a CDN.
I am relatively new to JavaScript and trying to find a way to get a good overall understanding of JavaScript projects, frameworks, etc. For example, when I look at a JavaScript-based project on GitHub, I would like a one-page snapshot of the dependencies between the HTML, the CSS and the various .js files that require further .js files (modules), instead of browsing the source tree and opening the individual files. What I am looking for is either an object diagramming tool or something like a "file diagram".
Is there a tool out in the wild that already does this? (And yes, I have already tried Googling it.)
(I used to use a tool in the Windows world for tracking DLLs which is a similar concept.)
https://github.com/nodejitsu/require-analyzer gets you part of the way there.
One could also implement a file dependency analyzer if you are looking for more comprehensive html/template analysis with these two:
http://nodejs.org/docs/v0.4.8/api/fs.html#fs.readdir
http://nodejs.org/docs/v0.4.8/api/fs.html#fs.watchFile
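To give an idea of what such a home-grown analyzer might look like, here is a minimal sketch. It is written in Python rather than with the Node fs calls linked above, and the names scan_dependencies and format_graph are made up for this example. It assumes dependencies show up as literal script src, stylesheet href and require('...') strings, so anything built dynamically will be missed.

```python
import re
from collections import defaultdict
from pathlib import Path

# Deliberately simple patterns (an assumption of this sketch): they only
# catch literal paths, not ones assembled at runtime.
SCRIPT_SRC = re.compile(r'<script[^>]+src=["\']([^"\']+)["\']', re.IGNORECASE)
STYLESHEET = re.compile(r'<link[^>]+href=["\']([^"\']+\.css)["\']', re.IGNORECASE)
REQUIRE = re.compile(r'require\(\s*["\']([^"\']+)["\']\s*\)')


def scan_dependencies(root):
    """Map each .html/.js file under root to the set of files it references."""
    graph = defaultdict(set)
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file():
            continue
        suffix = path.suffix.lower()
        if suffix not in (".html", ".js"):
            continue
        text = path.read_text(encoding="utf-8", errors="replace")
        patterns = (SCRIPT_SRC, STYLESHEET) if suffix == ".html" else (REQUIRE,)
        name = path.relative_to(root).as_posix()
        for pattern in patterns:
            # Files with no matches still get an (empty) entry.
            graph[name].update(pattern.findall(text))
    return dict(graph)


def format_graph(graph):
    """Render the graph as indented text, one block per source file."""
    lines = []
    for source in sorted(graph):
        lines.append(source)
        lines.extend(f"  -> {target}" for target in sorted(graph[source]))
    return "\n".join(lines)
```

Calling print(format_graph(scan_dependencies("path/to/checkout"))) gives a one-page text listing; feeding the same dictionary into a graph drawing tool would be the next step.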
Using Firebug you can see the files requested by each page, the server response and you can filter them by type. The HTML view lets you see the entire page including related js/css content. I don't think it's exactly what you are looking for, but I find it helpful for this sort of thing.
Here is some bookmarklet code that could help (taken from https://www.squarefree.com/bookmarklets/webdevel.html):
view style sheet :
javascript:s=document.getElementsByTagName('STYLE');%20ex=document.getElementsByTagName('LINK');%20d=window.open().document;%20/set%20base%20href/d.open();d.close();%20b=d.body;%20function%20trim(s){return%20s.replace(/^\s*\n/,%20'').replace(/\s*$/,%20'');%20};%20function%20iff(a,b,c){return%20b?a+b+c:'';}function%20add(h){b.appendChild(h);}%20function%20makeTag(t){return%20d.createElement(t);}%20function%20makeText(tag,text){t=makeTag(tag);t.appendChild(d.createTextNode(text));%20return%20t;}%20add(makeText('style',%20'iframe{width:100%;height:18em;border:1px%20solid;'));%20add(makeText('h3',%20d.title='Style%20sheets%20in%20'%20+%20location.href));%20for(i=0;%20i
view scripts:
javascript:s=document.getElementsByTagName('SCRIPT');%20d=window.open().document;%20/140681/d.open();d.close();%20b=d.body;%20function%20trim(s){return%20s.replace(/^\s*\n/,%20'').replace(/\s*$/,%20'');%20};%20function%20add(h){b.appendChild(h);}%20function%20makeTag(t){return%20d.createElement(t);}%20function%20makeText(tag,text){t=makeTag(tag);t.appendChild(d.createTextNode(text));%20return%20t;}%20add(makeText('style',%20'iframe{width:100%;height:18em;border:1px%20solid;'));%20add(makeText('h3',%20d.title='Scripts%20in%20'%20+%20location.href));%20for(i=0;%20i
So we're writing a full-text search framework for MongoDB. MongoDB is pretty much JavaScript-native, so we wrote the JavaScript library first, and it works.
Now I'm trying to write a Python framework for it, which will be partially in Python but will partially use those same stored JavaScript functions; the JavaScript functions are an intrinsic part of the library. On the other hand, the JavaScript framework does not depend on Python. Since they are pretty intertwined, it seems worthwhile keeping them in the same repository.
I'm trying to work out a way of structuring the whole project to give the javascript and python frameworks equal status (maybe a ruby driver or whatever in the future?), but still allow the python library to install nicely.
Currently it looks like this: (simplified a little)
javascript/jstest/test1.js
javascript/mongo-fulltext/search.js
javascript/mongo-fulltext/util.js
python/docs/indext.rst
python/tests/search_test.py
python/tests/__init__.py
python/mongofulltextsearch/__init__.py
python/mongofulltextsearch/mongo_search.py
python/mongofulltextsearch/util.py
python/setup.py
I've left out a few files for simplicity, but you get the general idea; it's a pretty much standard Python project... except that it depends critically on a whole bunch of JavaScript which is stored in a sibling directory tree.
What's the preferred setup for dealing with this kind of thing when it comes to setuptools? I can work out how to use package_data etc to install data files that live inside my python project as per the setuptools docs.
The problem is that I want to use setuptools to install everything, including the JavaScript files from outside the Python code tree, and then access them in a consistent way both when I'm developing the Python code and when it is easy_installed to someone's site.
Is that supported behaviour for setuptools? Should I be using paver or distutils2 or Distribute or something? (Basic distutils is not an option; the whole reason I'm doing this is to enable requirements tracking.) How should I be reading the contents of those files into Python scripts?
The short answer is that none of the Python distribution tools is going to do what you want, the exact way you want it. Even if you use distutils' data_files feature, you're still going to have to have your javascript files copied into your Python project directory (i.e., somewhere under the same directory as your setup.py.)
Given that, you might as well just copy the .js files to your package (i.e. alongside mongofulltextsearch/__init__.py) as part of your build process, and use package_data or include_package_data=True.
(Or alternatively, you could possibly use symlinks, externals, or some such, if your revision control system supports those. I believe that when building source distributions, the Python distribution tools convert symlinks to real files. At least, you could give that a try.)
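As a rough sketch of the copy-into-the-package approach, something like the following could work. The helper names sync_js and load_js and the js subdirectory are assumptions made for this example, not anything setuptools provides; pkgutil.get_data is the standard library call that reads a file shipped inside a package, and it behaves the same in a source checkout and after installation.

```python
import pkgutil
import shutil
from pathlib import Path


def sync_js(js_dir, package_dir, subdir="js"):
    """Copy every .js file from js_dir into package_dir/subdir; return the names copied."""
    target = Path(package_dir) / subdir
    target.mkdir(parents=True, exist_ok=True)
    copied = []
    for source in sorted(Path(js_dir).glob("*.js")):
        shutil.copy2(source, target / source.name)
        copied.append(source.name)
    return copied


def load_js(name, package="mongofulltextsearch", subdir="js"):
    """Read a bundled script as text from inside the (hypothetical) js subdirectory."""
    data = pkgutil.get_data(package, f"{subdir}/{name}")
    if data is None:
        raise FileNotFoundError(name)
    return data.decode("utf-8")
```

In setup.py you would call sync_js("../javascript/mongo-fulltext", "mongofulltextsearch") before setup(...), and pass package_data={"mongofulltextsearch": ["js/*.js"]} so the copies end up in the distribution. The library code then uses load_js("search.js") and never needs to know where the files originally lived.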
Our project has more than 300 JSP files and more than 200 JavaScript files. I'd like to do some cleanup by removing unnecessary JS files. Even if a JSP includes a JS file, maybe none of its functions are used. The goal is to reduce both complexity and the time needed to load the page. My IDE is Eclipse. Given the dynamic nature of JavaScript, I guess it will be hard or even impossible.
If it's conceivable that the application can be tested with a lot of coverage (i.e. going through every dialog, error message, and situation imaginable) you may be able to work with your access log files - compare the list of JS files to those fetched after period x of heavy use.
An alternative implementation of this would be setting up a "honeypot" (see my answer to this question).
Both these methods are of course "soft" in that their quality relies on how thoroughly the application is actually used during testing.
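The access log comparison could be sketched roughly like this. It assumes the logs are in the usual common/combined format and that URL paths map directly onto files under the web root; neither may hold for your setup, and the function names are made up for this example.

```python
import re
from pathlib import Path

# Matches the request part of a common/combined log line,
# e.g. "GET /js/menu.js?v=3 HTTP/1.1".
REQUEST = re.compile(r'"(?:GET|HEAD) (\S+) HTTP/[\d.]+"')


def requested_js(log_lines):
    """Return the set of .js URL paths seen in the log, query strings removed."""
    seen = set()
    for line in log_lines:
        match = REQUEST.search(line)
        if match:
            path = match.group(1).split("?", 1)[0]
            if path.endswith(".js"):
                seen.add(path)
    return seen


def never_requested(web_root, log_lines):
    """List .js files under web_root whose URL path never appears in the log."""
    seen = requested_js(log_lines)
    root = Path(web_root)
    # Assumption: /js/menu.js in the log corresponds to <web_root>/js/menu.js.
    on_disk = {"/" + p.relative_to(root).as_posix() for p in root.rglob("*.js")}
    return sorted(on_disk - seen)
```

Whatever this returns is only a list of candidates for removal, for the reason given above: a file that was never fetched during the test period may still be needed by a page nobody visited.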
If you have any way of grepping all script references, that would be preferable. Maybe you can do a global search for {anything}.js, which would match most ways of embedding a JS file.
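A global search of that kind might look like the sketch below. It matches by file name only, which is an assumption that over-counts (two files with the same name in different folders are treated as one) and still cannot see names that are assembled at runtime. The function names are invented for this example.

```python
import re
from pathlib import Path

# Any token ending in ".js", e.g. src="/js/menu.js" or 'lib/util.js?v=2'.
# The trailing \b keeps ".jsp" and ".json" from matching.
JS_REFERENCE = re.compile(r'([\w./-]+\.js)\b')


def referenced_names(source_root, extensions=(".jsp", ".html", ".js")):
    """Collect the base names of all .js files mentioned in the source files."""
    names = set()
    for path in Path(source_root).rglob("*"):
        if path.is_file() and path.suffix.lower() in extensions:
            text = path.read_text(encoding="utf-8", errors="replace")
            names.update(ref.rsplit("/", 1)[-1] for ref in JS_REFERENCE.findall(text))
    return names


def unreferenced_js(source_root):
    """List .js files whose base name is not mentioned anywhere under source_root."""
    names = referenced_names(source_root)
    root = Path(source_root)
    return sorted(
        p.relative_to(root).as_posix()
        for p in root.rglob("*.js")
        if p.name not in names
    )
```

This only answers "is the file included anywhere", not "are any of its functions called", so it covers the first half of the cleanup at best.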
To find out which functions and JavaScript files are used in a project, you need code coverage tools, like JSCoverage or Code Coverage for Firebug. These tools report the functions and the files that were used. Using them with an automated test suite like Selenium, or with randomized testing, should give you a fairly good idea of which files are loaded.
If the files are loaded dynamically, you can also use Firebug or Fiddler to log the requests for the JS files.
Unfortunately, if you want certainty, not just the extremely high likelihood that you get with the above tools, you would have to generate a call graph for your entire webapp, maybe using a JavaScript compiler like Rhino...
These days I find myself shifting out more and more work to the client side and hence my JS files tend to get bigger and bigger. I have come to the point where most HTML pages have half a dozen or more JS imports in the header and I realised that this is hurting loading times.
I have recently discovered this script, which lets you download several JS files with one HTTP request. It is written in PHP and, being a Django fan, I'm planning to rewrite it in Python. I'm planning to use an HTTP redirect to the pre-minified and concatenated file and was wondering what the cost of a 301 would be. Please let me know if that is a stupid idea.
On the other hand, I am a little worried about introducing scripting logic into the serving of static files, and I was wondering if there is a viable build-time alternative, say an Ant task that concatenates and minifies JS files and replaces multiple JS downloads in an HTML header with one big one, like the script does.
For PHP I certainly favour doing it dynamically just because if you introduce a build step you're losing one of the main benefits of using PHP. In fact, at the risk of self-promotion I've written Supercharging Javascript in PHP about this very issue.
Of course other technologies may vary.
Again it is PHP but it's not just a lump of code for you to use (although you can jump straight to Part 6 if you just want some fully working code) and may have value to you in terms of identifying the issues and doing things the right way and why you do them that way.
I favour having bundles of Javascript files (maybe only one for the entire application) and then each page simply activates the behaviour it needs through standard means but all the code bodies are in the larger cached and minified JS file. It works out fastest this way and is a good way to go.
If you do want it as part of a build process, which is a reasonable solution if you have a build process anyway, then I suggest you minify your code. There are lots of tools to do this. Have a look at YUI Compressor.
If you do a static combine of JS files, the other stuff mentioned above such as gzipping and associated issues is still relevant.
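For the static combine itself, a build step can be quite small. The following is a minimal sketch in Python (which fits the Django setup mentioned in the question); build_bundle is a made-up name, and it only concatenates and pre-compresses. Minification is left to an external tool such as YUI Compressor, run on the bundle afterwards.

```python
import gzip
from pathlib import Path


def build_bundle(sources, bundle_path):
    """Concatenate JS files in the given order into one file plus a .gz copy."""
    parts = []
    for source in sources:
        text = Path(source).read_text(encoding="utf-8")
        # The leading comment helps when debugging the bundle; the ";" on its
        # own line guards against files that end without a terminator.
        parts.append(f"/* {Path(source).name} */\n{text.rstrip()}\n;\n")
    bundle = Path(bundle_path)
    bundle.write_text("".join(parts), encoding="utf-8")
    # Pre-compressed copy, for servers configured to serve static .gz files.
    gz_path = bundle.with_name(bundle.name + ".gz")
    gz_path.write_bytes(gzip.compress(bundle.read_bytes()))
    return bundle, gz_path
```

The order of sources matters, since later files may depend on earlier ones, so it is safer to list them explicitly than to glob a directory. The HTML header then references the single bundle file instead of the individual scripts.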
YUI compressor is a good choice. If you want to learn how to set up an Ant-based build process, have a look at this Tutorial: http://www.javascriptr.com/2009/07/21/setting-up-a-javascript-build-process/
As a Ruby-based alternative, I would recommend Sprockets.