Having trouble searching on this. The only thing I've found so far was related to a mapping API where (for map markers) JavaScript bogs down badly around 500 IDs. I'm pretty sure that was related to the particular scripts in use though and not a general rule.
I have a page with a complicated list. Each item in the list has about 10 distinct IDs on it and there are several JavaScript scripts in play. The list is paginated and the user has a choice of how many items to show per page. Considering each item in the list has ~10 IDs, I'm wondering what I should set the maximum number of items per page to (the total number of records the user can choose to view at one time on a page).
I mean, aside from increased load times based on the raw number of records, is there a known number of IDs at which a limit is hit or at which CSS/JavaScript performance starts to degrade sharply?
Edit: Pardon, each 'item' is a complex business card (to paint a picture) with three divs in play and a small form. That's why each item (record) has about 10 IDs.
I think you will find that it will ultimately depend on the user's browser and personal system.
What would probably make the most sense is to do some research on your intended user base: what browsers they run and how much RAM they have available. You can profile your application in your local environment (and there was already a discussion about doing just that). That would give you an idea of how many system resources you're using versus how much the user has available.
But given the amount of highly graphical JavaScript stuff going on with HTML5, etc., I would wager you're going to need a ridiculous amount of data on display before JavaScript performance becomes your biggest sticking point (at least in the most modern browsers; if you have to support something like IE6, all bets are off).
You didn't answer my comment and you've already accepted an answer, but I'll give an answer anyway, because I've got a feeling you may not need all the IDs you have.
Let's take the following example code:
<div id="card-1" onclick="doSomethingWithCard('card-1')">
<h2 id="card-1-name">John Doe</h2>
</div>
Both IDs here can be unnecessary. Many people try to get a reference to the element that triggered an event by passing its ID to the event handler and using getElementById. Instead, this already contains a reference:
<div onclick="doSomethingWithCard(this)">
    ...
</div>
And once you have that reference, you can use getElementsByTagName, a CSS selector (usually with a JavaScript framework/library like jQuery) or XPath to access the elements inside, instead of giving them separate IDs. This works best if all items have the same inner structure, using the appropriate elements (h2, h3, ul, form, etc.) and classes.
jQuery example:
function doSomethingWithCard(card) {
    alert($(card).find("h2").text());
}
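The same lookup also works without jQuery; here is a minimal plain-DOM sketch (assuming each card keeps an h2 inside, as above):
function doSomethingWithCard(card) {
    // 'card' is the element passed in via 'this'; find its heading by
    // tag name instead of by a per-card ID.
    alert(card.querySelector("h2").textContent);
}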
Related
Challenge
When using Puppeteer's page.click('something'), one of the first challenges is to make sure that the right 'something' is provided.
I guess this is a very common challenge, yet I did not find any simple way to achieve this.
What I tried so far
In Google Chrome I inspect the element that I want to click. I then get an extensive element description with a class and such. Based on an example I found, my approach is now:
Take the class
Replace all spaces with dots
Try
If it fails, check what is around this and add it as a prefix, for example one or two instances of button.
This does not exactly feel like the best way (and it sometimes fails, perhaps due to inaccuracies on my side).
One thing I notice is that Chrome often seems to give a hint when hovering over the thing I want to click. I am not sure if that is right, but I also did not see a way to copy it (and it can be quite long).
If there is a totally different recommended way (e.g. looking in the browser for what the name roughly is, and then using Puppeteer to list all possible things), that is also fine. I just want to get the right input for page.click().
If you need an example of what I am trying: if you open this question in an incognito tab, you get options like share or follow. Or if you go to a web shop like Staples and want to add something to the cart.
When using Puppeteer's page.click('something'), one of the first challenges is to make sure that the right 'something' is provided.
Just to be clear, "something" is a CSS selector, so your question seems to reduce to how to write CSS selectors that are accurate. Or, since Puppeteer offers XPath and traditional DOM traversals, we could extend it to include those selection tools as well.
Broader still, if there's a data goal we're interested in, often there are other routes to get the data that don't involve touching the document at all.
I guess this is a very common challenge, yet I did not find any simple way to achieve this.
That's because there is no simple way to achieve this. It's like asking for the one baseball swing that hits all pitches. Web pages have messy, complex, arbitrary structures that follow thousands of different conventions (or no conventions at all). They can serve up a slightly or completely different page structure on any request. There's no silver-bullet strategy for writing good CSS selectors, and no step-by-step algorithm you can apply to universally "solve" the problem of accurately and robustly selecting elements.
Your goal should be to learn the toolkit and then practice on many different pages to develop an intuition for which tools and tricks work in which contexts and be able to correctly discern the tradeoffs in different approaches. Writing a full guide to this is out of scope, and articles exist elsewhere that cover this in depth, but here are a few high-level rules of thumb:
Look at context: consider the goals of your project, the general structure of the page and patterns on the page. Too many questions on Stack Overflow regarding CSS selectors (but also in general) omit context, which severely constrains the recommendation space, often leading to an XY problem. A few factors that are often relevant:
Whether the scrape is intended to be a one-off or a long-running script that should try to anticipate and be resilient to page changes over time
Development time/cost/goal tradeoffs
Whether the data can be obtained by other means than the DOM, like accessing an API, pulling a JSON blob from a <script> tag, accessing a global variable on the window or intercepting a network response.
Consider nesting: is the element in a frame or shadow DOM?
Consider whole-page context: which patterns does the site tend to follow? Are there parent elements that are useful for selecting a child? (Often this is a distant relationship, not visible in a screenshot as provided by OP.)
Consider all capabilities provided by your toolkit. For example, OP asked for a selector to close a modal on Stack Overflow; it turns out that none of the elements have particularly great CSS selectors, so using Puppeteer to trigger an Esc key press might be more robust (see the sketch after this list).
Keep it simple: since pages can change at any time, the more constraints you add to the selector, the more likely one of those assumptions will no longer be true, introducing unnecessary points of failure.
Look for unique identifiers first: ids are usually unique on a page (some Google pages seem to scoff at this rule), so those are usually the best bets. For elements without an id, my next steps are typically:
Look for an id in a close parent element and use that, then select the child based on its next-most-unique identifier, usually a class name or combination tag name and attribute (like an input field with a name attribute, for example).
If there are few ids or none nearby, check whether there is a class name or attribute that is unique. If so, consider using that, likely coupled with a parent container class.
When selecting between class names, pay attention to those that seem temporary or stateful and might be added and removed dynamically. For example, a class of .highlighted-tab might disappear when the element isn't highlighted.
Prefer "bespoke" class names that seem tied to role or logic over generic library class names associated with styling (bootstrap, semantic UI, material UI, tailwind, etc).
Avoid the > operator, which can be too rigid, unless you need its precision to disambiguate a tree where no other identifiers are available.
Avoid sibling selectors unless unavoidable. Siblings often have more tenuous relationships than parents and children.
Avoid nth-child and nth-of-type to the extent possible. Lists are often reordered or may have fewer or more elements than you expect.
When using anything related to text, generally trim whitespace, ignore case and special characters where appropriate and prefer substrings over exact equality. On the other hand, don't be too loose. Usually, text content and values are weak targets but sometimes necessary.
Avoid pointless steps in a selector, like body > div#container > p > .target, which should just be #container .target or #container p .target. body says almost nothing, > is too rigid, div isn't necessary since we have an id (if it changes to a span our new selector will still work), and the p is generic; there are probably no .targets outside of ps anyway.
Avoid browser-generated selectors. These are usually the worst of both worlds: highly vague and rigid at the same time. The goal is to be the opposite: accurate and specific, yet as flexible as possible.
Feel free to break rules as appropriate.
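To make the Esc-key point above concrete, here is a minimal Puppeteer sketch (the URL is a placeholder and the modal is assumed, so treat this as an illustration rather than a drop-in script):
const puppeteer = require('puppeteer');

(async () => {
    const browser = await puppeteer.launch();
    const page = await browser.newPage();
    await page.goto('https://example.com'); // placeholder URL

    // Instead of hunting for a fragile selector on the modal's close
    // button, dismiss the modal with a key press.
    await page.keyboard.press('Escape');

    await browser.close();
})();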
I would like to know the maximum amount of data the Angular framework can handle. Say I am displaying a chart using Angular and some charting framework like Chart.js. I'd like to know up to how much data the browser can display properly, at what point it slows down, and at what point it crashes.
Your question has no simple answer, but I will try to flatten it and give a simple answer, or at least simple things to consider...
Angular (at runtime), like many other frameworks, is simply JavaScript, so let us reduce the question to "the limitations of JavaScript and browsers with regard to the data loaded".
JavaScript has no fixed upper limit on the memory or storage it can handle. I've seen JavaScript applications that require more than 15GB of RAM, and they performed well too. So assume data size itself is not an issue (unless your application is poorly implemented, leaking memory or just not very efficient, of course).
The main challenge, as I see it, is displaying and manipulating the information without causing unnecessary delay or unresponsiveness.
Displaying the information - let's say you have a list (or a table) containing 1,000,000 possible gifts, which you then want to display for the user to select.
Adding the list items to the document one by one is tempting, but it will require the browser to repaint after each addition (causing a delay, or full unresponsiveness, until finished). A better way is to add the elements to some DOM element (call it N) that is kept in memory only, append all the elements corresponding to the list items to N (still just an in-memory operation), and finally add N to the document. Inserting the entire list in one operation is a much better solution for displaying a large amount of data.
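As a minimal sketch of that idea, using a DocumentFragment as the in-memory element N (the gift-list id is assumed for illustration):
const fragment = document.createDocumentFragment();

for (let i = 0; i < 1000000; i++) {
    const item = document.createElement('li');
    item.textContent = 'Gift #' + i;
    fragment.appendChild(item); // in-memory only, no repaint yet
}

// One insertion, so the browser reflows/repaints once for the whole list.
document.getElementById('gift-list').appendChild(fragment);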
Manipulating the information - displaying is indeed not enough. You would like to move, drag, sort and filter the data being displayed. And as mentioned before, it is a bad idea to remove many elements directly from the DOM. You should instead detach the container from the document's DOM into memory, manipulate the data inside it, and then add the container right back to the document. Angular does this kind of magic for you.
(Toggling the 'display: none/block' CSS property of many elements has a similar blocking effect, as I recall.)
A good practice is to implement an application/page that shows only the amount of data a human can process at a single glance. The rest should live in the application's data layer, in memory, and should be loaded into the display only on the appropriate need or request.
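A small sketch of that practice (loadData and the label field are hypothetical stand-ins for your data layer): keep the full data set in memory and render only the current page's slice.
const PAGE_SIZE = 50;
const allRows = loadData(); // hypothetical: returns a large in-memory array

function renderPage(container, pageIndex) {
    const slice = allRows.slice(pageIndex * PAGE_SIZE, (pageIndex + 1) * PAGE_SIZE);
    const fragment = document.createDocumentFragment();
    for (const row of slice) {
        const el = document.createElement('div');
        el.textContent = row.label; // hypothetical field
        fragment.appendChild(el);
    }
    container.replaceChildren(fragment); // swap the whole page in one operation
}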
To conclude, you can deal with huge amounts of data as long as you provide a mechanism that efficiently filters the displayed information.
I hope it helps...
For further reading:
Slow and fast ways of adding elements to DOM
A question emphasizes the lack of memory limit used by JS
CSS display attribute performance
A good discussion about the reasons for slow DOM
About using HTML5 correctly - old but still true
Once the DOM creation procedure is understood, it is much easier to display data without affecting performance / user experience
Usually we search for elements by selector in libraries like jQuery, but what happens if we want to go the other way around: given an element, we want to find all the rules applied to it, like Firebug does!
This kind of job is necessary if we want to load a CSS file and update it so we can see live results in our webpage.
In Firebug, by selecting an element we can see, in reverse order, all the rules that were applied to it.
First, we have to find the document's style rules and retrieve their contents as text so the user can update them, but the real problem is to find the relation between an element and all the rules....
Is this possible, even with the help of a library like jQuery?
I was looking for opinions about what tools one can use to tackle such a problem within the restrictions imposed by a browser environment...
My opinion is that we have to run a brute-force search over the CSS files/selectors and construct the dependencies: suppose we have 100 rules for a page with 100 elements; the possible combinations are 10^4, as the relations are many-to-many.
Then, we can build 'tables' in memory (hash arrays) that might exceed 10^4 records if we want to keep the cascading order of rules for an element.
Anyway, my point is that I wouldn't dare to put jQuery through such pain! Unless we can 'transplant' the heart of jQuery, its selector engine I mean.
That heart seems to answer to the name 'Sizzle', but it's a waste of time here (regular expressions? no thanks). I think the real 'heart' is a simple word, querySelectorAll, and now we are running at native speed...
Just an opinion, since I don't care about wide/old browser support.
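Here is a minimal sketch of that brute-force matching, testing every same-origin style rule against an element with the native Element.matches():
function getAppliedRules(element) {
    const matched = [];
    for (const sheet of document.styleSheets) {
        let rules;
        try {
            rules = sheet.cssRules; // throws for cross-origin stylesheets
        } catch (e) {
            continue;
        }
        for (const rule of rules) {
            if (rule.selectorText && element.matches(rule.selectorText)) {
                matched.push(rule);
            }
        }
    }
    return matched; // in document order, not cascade order
}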
I'm currently debugging an AJAX chat that just endlessly fills the page with DOM elements. If you have a chat going for, like, 3 hours you will end up with God knows how many thousands of DOM nodes.
What are the problems related to extreme DOM Usage?
Is it possible that the UI becomes totally unresponsive (especially in Internet Explorer)?
(And related to this question is of course the solution, if there are any solutions other than manual garbage collection and removal of DOM nodes.)
Most modern browsers should be able to deal pretty well with huge DOM trees. And "most" usually doesn't include IE.
So yes, your browser can become unresponsive, either because it needs too much RAM (which leads to swapping) or because its renderer is just overwhelmed.
The standard solution is to drop elements, say after the page has 10'000 lines worth of chat. Even 100'000 lines shouldn't be a big problem. But I'd start to feel uneasy for numbers much larger than that (say millions of lines).
[EDIT] Another problem is memory leaks. Even though JS uses garbage collection, if you make a mistake in your code and keep references to deleted DOM elements in global variables (or objects references from a global variable), you can run out of memory even though the page itself contains only a few thousand elements.
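A tiny sketch of that kind of leak (the archive array is purely illustrative):
const archive = []; // global; grows forever

function removeMessage(node) {
    archive.push(node); // leak: the detached node (and its subtree) stays reachable
    node.remove();      // gone from the document, but not from memory
}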
Just having lots of DOM nodes shouldn't be much of an issue (unless the client is short on RAM); however, manipulating lots of DOM nodes will be pretty slow. For example, looping through a group of elements and changing the background color of each is fine if you're doing this to 100 elements, but may take a while if you're doing it on 100,000. Also, some old browsers have problems when working with a huge DOM tree; for example, scrolling through a table with hundreds of thousands of rows may be unacceptably slow.
A good solution to this is to buffer the view. Basically, you only show the elements that are visible on the screen at any given moment, and when the user scrolls, you remove the elements that get hidden, and show the ones that get revealed. This way, the number of DOM nodes in the tree is relatively constant, but you don't really lose anything.
Another similar solution to this is to implement a cap on the number of messages that are shown at any given time. This way, any messages past, say, 100 get removed, and to see them you need to click a button or link that shows more. This is sort of what Facebook does with their profiles, if you need a reference.
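A minimal sketch of that cap (MAX_MESSAGES and the container are illustrative):
const MAX_MESSAGES = 100;

function appendMessage(container, text) {
    const line = document.createElement('div');
    line.textContent = text;
    container.appendChild(line);

    // Drop the oldest messages so the DOM stays bounded.
    while (container.children.length > MAX_MESSAGES) {
        container.removeChild(container.firstElementChild);
    }
}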
Problems with extreme DOM usage boil down to performance. DOM scripting is very expensive, so constantly accessing and manipulating the DOM can result in poor performance (and a poor user experience), particularly when the number of elements becomes very large.
Consider HTML collections such as document.getElementsByTagName('div'), for example. This is a query against the document, and it will be re-executed every time up-to-date information is required, such as the collection's length. This can lead to inefficiencies, and the worst cases occur when accessing and manipulating collections inside loops.
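A sketch of the classic pitfall with live collections (loop bodies elided):
const divs = document.getElementsByTagName('div'); // live HTMLCollection

// Slower: .length may be re-evaluated against the document on each pass.
for (let i = 0; i < divs.length; i++) { /* ... */ }

// Faster: cache the length (safe as long as the loop doesn't add or remove divs).
for (let i = 0, len = divs.length; i < len; i++) { /* ... */ }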
There are many considerations and examples, but like anything it depends on the application.
I'm often conflicted about how to approach this problem in my applications. I've used any number of options including:
A generic multiselect - This is my least favorite and most rarely used option. I find the usability to be atrocious, a simple mis-click can screw up all your hard work.
An "autocomplete" solution - Downside: user must have spelling abilities to find the damn values they need, aren't exposed to ones they may not have in mind, and the potential backend performance of substring searching.
Two adjacent multiselects, with an add/remove button - Downsides: still "ugly" imo
Any number of fancy javascript solutions (http://livepipe.net/control/selectmultiple, http://loopj.com/2009/04/25/jquery-plugin-tokenizing-autocomplete-text-entry/, etc.)
I haven't been able to find any usability studies done on the best approach to this problem. Many of these alternate solutions are great when you're going from <10 elements to a hundred, but may break down completely when you are going from a hundred to a thousand.
What do you guys use? Why do you use it? Can you point me to usability case studies? Is there a "magic" solution that has yet to be discovered?
My advice is don't use generic multiple select controls. I've been running User Experience Research for my whole career, and every single time I test a site with multiple select controls, it always turns out to cause problems for end users.
I did a post on this a while back: Multiple Select Controls Must Evolve or Die
Sounds like you knew this anyway, though. Your real question is "what do I use instead?" Well, to answer this question you have to work out whether the user's task leans towards recall or recognition.
(i) By recall, I mean the user knows what they want to pick before they have even seen the list. In this case, it's probably easiest for them if you offer an autocomplete tool (as used very effectively on Facebook, for example). This solution is even better when the list of options is also impossibly long to present on a page (e.g. location names, etc).
(ii) Moving on to recognition - by this I mean a task that involves the user not knowing what they want to pick until they've seen the list of options. In this case, autocomplete doesn't give them any hints. An array of checkboxes would be much more helpful. If you can show them all at once, this is helpful to the user. Scrolling DIVs are more compact but they create a memory load for the user - i.e. once they've scrolled down, they have to remember which items they picked. This is particularly evident when users save a form and come back to it later.
So - thinking about your problem, do you need a solution for recall or recognition?
I can't point you to any case studies, unfortunately, but what I can tell you is that I personally prefer large checkbox arrays in two-to-five column layouts. Sure, they take up a lot of space, but they are extremely precise and uncomplicated.
I think for any control - be it basic multiselect, double list, checkbox array, or any other solution - once you go over a certain threshold of items it's going to be challenging for the user no matter what.
Have a look at Dojo Toolkit's DataGrid control. It's by far the most flexible and powerful, and supports multiple row selections. It also has accessibility features built in.
The ExtJS library has some really good solutions for your issue. There are a bunch of user extensions for neat-o combos and multi-select boxes.
If you want a combo select list, you can add query searching and pagination, plus design the resulting drop-down using easy templating, like in this example:
http://extjs.com/deploy/dev/examples/form/forum-search.html
Here is a nice multiselect, in the style you seemed to describe:
*(main_site)/learn/Extension:Multiselect2
You can find all user extensions here:
*(main_site)/learn/Ext_Extensions
Plus, you can easily include it into an existing web page without alot of extra overhead. ExtJS's full stack is quite large, but to get only the JS files you need, they provide a nice builder tool to grab just those parts you need:
*(main_site)/products/extjs/build/
Just a warning: ExtJS has just released 3.0, but I'm not sure the user extensions have been upgraded. The "forum-search" was taken from a 3.0 example though, so it will work just fine with the latest and greatest.
(*) Apparently new users can only post one link...