Topic: MODx limits on the number of documents  (Read 37315 times)


#61: 10-Dec-2008, 02:13 PM


cbaone
Posts: 563

Utah Web Design

WWW
Thanks, Jason. I appreciate your time. It's just as I thought. I guess I was overthinking it. Thanks. :)
The MODx has you...
Utah Web Design

#62: 13-Dec-2008, 08:00 AM

Testers

ganeshXL
Posts: 2,015

true is true

WWW
I did a little test...

I created 10000 pages on my local WAMP setup (0.9.6.3 RC2). I imported random text as articles with a little PHP script (sliced up a book from gutenberg.org, inserted docs with the docmanager class). Those pages are all in one single MODx container.

Then I created a simple Ditto pagination nav for these documents. I used the performance / speed tags at the bottom of the page to see how much time rendering takes. The average was around 3.1 seconds (most of it for MySQL alone).

I expected something like that, so I just rewrote the pagination stuff in plain PHP/MySQL. Now MySQL takes... 0.0044 s.
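For readers wondering what a plain paginated query looks like, here is a minimal sketch. It is written in Python, with an in-memory SQLite table standing in for MySQL and for the MODx content table. The table layout, column names and the `fetch_page` helper are assumptions for illustration, not the actual script used in the test above.

```python
import sqlite3

def make_demo_db(total=10000, parent=1):
    """Build an in-memory table standing in for the content table (assumed layout)."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE site_content (id INTEGER PRIMARY KEY, parent INTEGER, pagetitle TEXT)"
    )
    db.executemany(
        "INSERT INTO site_content (id, parent, pagetitle) VALUES (?, ?, ?)",
        [(i, parent, f"Article {i}") for i in range(1, total + 1)],
    )
    db.execute("CREATE INDEX idx_parent ON site_content (parent)")
    return db

def fetch_page(db, parent, page, per_page=10):
    """Return only the rows for one page; the database does the slicing."""
    offset = (page - 1) * per_page
    return db.execute(
        "SELECT id, pagetitle FROM site_content "
        "WHERE parent = ? ORDER BY id LIMIT ? OFFSET ?",
        (parent, per_page, offset),
    ).fetchall()

db = make_demo_db()
rows = fetch_page(db, parent=1, page=3)
print(rows[0], rows[-1])
```

The point of the sketch is that only ten rows ever leave the database per request, no matter how many documents sit in the container.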

Is the above a real-life scenario? I don't think so. Having 10k documents in one single container is insane. Is this a scientific test? Hardly. Am I knocking down Ditto? Absolutely not.

Basically, I can just repeat what has been said about scenarios where you need to serve thousands of pages: you need to optimize bits and pieces yourself. But that's probably the case with every CMS.

The devil is often in the details: e.g. if you write a custom snippet and you're creating links, you loop through a MySQL result set. You could very well use $modx->makeUrl() to get the URL. However, if you know that you're using SEO-friendly URLs, and you know how they're constructed (alias with or without suffix etc.), you could omit this API call and simply concatenate pagetitle + suffix vars. Because I'm sure this API function internally hits MySQL again at least once. // edit: wrong wrong wrong... see OpenGeek's comments below!
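Leaving aside the retracted guess about MySQL, the concatenation idea itself is easy to picture. A minimal sketch in Python, assuming a flat URL scheme of base + prefix + alias + suffix with no nested alias paths; the `friendly_url` helper and the sample rows are hypothetical, not part of the MODx API:

```python
def friendly_url(alias, prefix="", suffix=".html", base="/"):
    """Build a friendly URL by plain string concatenation.

    Assumed scheme: base + prefix + alias + suffix (flat aliases only).
    """
    return f"{base}{prefix}{alias}{suffix}"

# (id, alias) pairs as they might come back from a result set
rows = [(12, "about-us"), (13, "contact")]
links = [friendly_url(alias) for _id, alias in rows]
print(links)
```

This only holds as long as the site's URL settings never change; an API call stays correct when they do.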

Also, if you just use "previous" and "next" links, and don't need a list of all the page-listings in-between, just drop it.
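One common way to get "previous" and "next" links without counting every row is to fetch one row more than the page size. A minimal sketch in Python with SQLite as a stand-in database; the `docs` table and the `prev_next_page` helper are assumptions for illustration:

```python
import sqlite3

def prev_next_page(db, page, per_page=10):
    """Fetch one page plus one extra row to learn whether a next page exists."""
    offset = (page - 1) * per_page
    rows = db.execute(
        "SELECT id, pagetitle FROM docs ORDER BY id LIMIT ? OFFSET ?",
        (per_page + 1, offset),
    ).fetchall()
    has_next = len(rows) > per_page   # the extra row is only a signal
    has_prev = page > 1
    return rows[:per_page], has_prev, has_next

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE docs (id INTEGER PRIMARY KEY, pagetitle TEXT)")
db.executemany(
    "INSERT INTO docs VALUES (?, ?)", [(i, f"Doc {i}") for i in range(1, 26)]
)

items, has_prev, has_next = prev_next_page(db, page=3)
print(len(items), has_prev, has_next)
```

No COUNT query and no list of page numbers is needed, which is the saving suggested above.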
« Last Edit: 14-Dec-2008, 11:27 AM by ganeshXL »

#63: 13-Dec-2008, 09:34 AM


mrhaw
Posts: 1,918

modx == freedom

WWW
Super cool info!!

Did you try using the doc tree in the manager?
My playground: http://4up2date.info | Twitter: mrhaw
---> Check out: ReadSpeaker webReader Plugin | Support/Comments Thread

--=[ MR. HAW ]=--

#64: 13-Dec-2008, 05:27 PM

Foundation

OpenGeek
MODx Co-Founder
Posts: 6,934

damn accurate caricatures...

WWW
Quote from: ganeshXL
The devil is often in the details: e.g. if you write a custom snippet and you're creating links, you loop through a MySQL result set. You could very well use $modx->makeUrl() to get the URL. However, if you know that you're using SEO-friendly URLs, and you know how they're constructed (alias with or without suffix etc.), you could omit this API call and simply concatenate pagetitle + suffix vars. Because I'm sure this API function internally hits MySQL again at least once.

Well, actually that function does not use the database at all. In fact, in order to avoid the MySQL calls when using makeUrl(), MODx caches all of the document entries, along with the information needed to construct the URLs, in siteCache.idx.php. And this brings us back to the original problem: loading 10000 documents via siteCache on every request, just to be able to generate URLs more quickly at runtime, becomes less efficient than calling MySQL on demand, since you typically would not be creating URLs to all 10000 documents on every page.
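The trade-off can be pictured with a small sketch. This is Python with SQLite, not MODx code: MODx's site cache is a PHP file rather than a query, so `load_full_index` merely stands in for "load every document entry up front" and `lookup_on_demand` for "ask the database for the few entries this page needs". Both helpers and the `docs` table are assumptions for illustration.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE docs (id INTEGER PRIMARY KEY, alias TEXT)")
db.executemany(
    "INSERT INTO docs VALUES (?, ?)", [(i, f"page-{i}") for i in range(1, 10001)]
)

def load_full_index(db):
    """Eager approach: build the whole id -> alias map on every request."""
    return dict(db.execute("SELECT id, alias FROM docs"))

def lookup_on_demand(db, ids):
    """Lazy approach: fetch only the handful of ids this page links to."""
    marks = ",".join("?" * len(ids))
    return dict(
        db.execute(f"SELECT id, alias FROM docs WHERE id IN ({marks})", ids)
    )

needed = [3, 42, 9999]
full = load_full_index(db)
few = lookup_on_demand(db, needed)
print(len(full), len(few))
```

Both approaches yield the same aliases for the three links; the eager one pays for 10000 entries to get them.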

Personally, I always run Ditto cached myself, opting to only have it regenerate when I edit the site. A caching strategy has to be considered when developing dynamic sites with thousands of documents; there is no way around it. This is why there is even more granularity and flexibility in managing caching options in Revolution, so custom caching strategies can be easily implemented and modified as a site grows.
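The "regenerate only when the site is edited" strategy boils down to something like the following. A minimal sketch in Python; `RenderCache` is a hypothetical helper written for this illustration, not how MODx or Ditto implement caching internally.

```python
class RenderCache:
    """Keep rendered output until content is edited (hypothetical helper)."""

    def __init__(self, render):
        self._render = render   # expensive function, e.g. a listing query
        self._store = {}
        self.renders = 0        # counts how often the expensive path ran

    def get(self, key):
        if key not in self._store:
            self._store[key] = self._render(key)
            self.renders += 1
        return self._store[key]

    def clear(self):
        """Call this whenever a document is saved."""
        self._store.clear()

cache = RenderCache(lambda key: f"<ul>listing for {key}</ul>")
cache.get("news-page-1")
cache.get("news-page-1")   # served from cache
cache.clear()              # simulated site edit
cache.get("news-page-1")   # rendered again
print(cache.renders)
```

Three requests cost two renders here; on a busy site with rare edits the ratio gets far better.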
Jason Coward
MODx Co-Founder
xPDO Founder
CTO @ Collabpad
work productively.
work intelligently.
work together.
Light is just a vibration of a note too. Everything is. You've got to keep that in mind.
  Frank Zappa

#65: 17-Dec-2008, 01:37 AM

worchyld
Posts: 354

I love MODx!

We use 1 instance of MODX to handle many sites, and unfortunately it's become incredibly bloated and very slow.

Whilst most of it could be down to simple HTML/JS/CSS issues, or server caching/optimization, I do have a sense that it's MODx that is causing the majority of the slow-down.

Currently we have 11500+ pages (not including chunks, snippets, plugins, etc) and of course we're handling 3 sites with 1 instance of MODX...

Anyway, whilst we're all waiting for MODX2 to come out, I'm thinking about a move to the Zend Framework, or at the very least using Zend to look at the MODX database and do stuff, so that I don't have to do much.

On this thread there is a discussion about Zend and MODX. My question is: they keep mentioning "Tattoo". I did a search but I couldn't find anything. What is "Tattoo" and how does it relate to MODX?

http://modxcms.com/forums/index.php/topic,3203.0.html

Finally, has anyone used the ZF to extend MODX so that you use ZF instead of MODx?

#66: 17-Dec-2008, 06:17 AM

Foundation

rthrash
Posts: 11,348

WWW
11700 docs ... wow, I would never recommend putting all of that into a single MODx instance. I don't think the Zend Framework will alleviate anything either. The slowdown is coming from the cache files. How big are your two .idx files? MODx has to parse the main .idx file on every request.

Tattoo = 0.9.7 = Revolution (the last name is sticking)
MODx is a content management framework that allows web professionals to turn over sites to end-users for daily maintenance without worrying. Please help us help you when asking for assistance and read the wiki. Searching the forums from the top level helps, too.
Ryan Thrash
MODx Co-Founder
Principal @ Collabpad
work productively.
work intelligently.
work together.

#67: 9-Jan-2009, 07:06 AM


Eol
Posts: 281

Do unpublished pages count toward the 'maximum' page limit?
They do not increase the cache file size, but they may have other performance effects since they are additional rows in the database.

Interesting... so one could probably maintain a global cutoff date for content and display an archive of older documents "synthetically" (Ditto has a parameter to aggregate unpublished documents). I need to think it through because we just inherited a site with approx. 2800 articles.

#68: 24-Jan-2009, 07:57 PM


cbaone
Posts: 563

Utah Web Design

WWW
Jason,
When using the wrapper snippet with Ditto/Reflect I get errors. Well, more specifically, when using reflect to show the archive I get "The Ditto object is invalid. Please check it." error (Line ~304 of the reflect snippet). I know the include snippet is working as it should, but what needs to be changed in ditto/reflect to accommodate the way it's called in the wrapper snippet? Here's my call:

Code:
[!Includes? &file=`reflect/reflect.snippet.php` &return=`1` !]
« Last Edit: 24-Jan-2009, 08:32 PM by cbaone »

#69: 26-Jan-2009, 04:56 AM

Foundation

rthrash
Posts: 11,348

WWW
Try stripping the &return=`1` param and see what happens. Also you might want to pass in the additional params normally needed by those snippets.

#70: 26-Jan-2009, 10:01 AM


cbaone
Posts: 563

Utah Web Design

WWW
No dice, I had already stripped it out once. I had the other parameters in there too I just didn't post them because there were so many.

#71: 26-Jan-2009, 05:48 PM


ChuckTrukk
Posts: 851

WWW
Reflect uses runSnippet(ditto, params).

But Ditto isn't in the MODx db anymore, so you'd need to have Reflect do runSnippet(includes, dittoparams+includeparams).
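To make the "dittoparams+includeparams" idea concrete, here is a minimal sketch in Python. Everything in it is a stand-in: `run_snippet`, `include_snippet`, the registry, the placeholder file path and the sample Ditto parameters are assumptions for illustration, not the real MODx API or the real Reflect code.

```python
def run_snippet(name, params, registry):
    """Hypothetical stand-in for a snippet runner: look up by name and call."""
    return registry[name](params)

def include_snippet(params):
    """Stand-in for the wrapper: takes the file, passes the other params on."""
    file_path = params["file"]
    rest = {k: v for k, v in params.items() if k not in ("file", "return")}
    return f"ran {file_path} with {sorted(rest)}"

# Only the wrapper is registered; there is no "ditto" entry any more.
registry = {"include": include_snippet}

ditto_params = {"parents": "5", "display": "10"}          # example values
include_params = {"file": "path/to/ditto.snippet.php",    # placeholder path
                  "return": "1"}

# Call the wrapper by name with both parameter sets merged.
result = run_snippet("include", {**include_params, **ditto_params}, registry)
print(result)
```

Calling `run_snippet("ditto", ...)` against this registry would fail with a lookup error, which mirrors the "Ditto object is invalid" symptom described earlier in the thread.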


Chuck the Trukk
ProWebscape.com :: Nashville-WebDesign.com
- - - - - - - -
What are TV's? Here's some info below.
http://modxcms.com/forums/index.php/topic,21081.msg159009.html#msg1590091
http://modxcms.com/forums/index.php/topic,14957.msg97008.html#msg97008

#72: 31-Jan-2009, 07:03 AM

jazzy
Posts: 15

Just change the snippet call to 'include' and add the file and return parameters; all the other parameters for the snippet call would be the same as if you were calling the snippet the old fashioned way:
Code:
[[include? &file=`assets/snippets/wayfinder/wayfinder.snippet.php` &return=`1` &startId=`0` &outerTpl=`MyWayfinderOuterTpl`]]

Very nifty piece of work indeed.  Although I am still new to all of this, I can see the merit of using this procedure.

The above example uses a Wayfinder snippet call, which I gather (I may be wrong) is standard in MODx. If a custom snippet in /assets/my_snippets/test.snippet.php is to be included instead, what else would be required, or what would be different? I ask this based on the following, especially the "modified tag that adds an additional parameter" part:
Quote from: OpenGeek
...then simply put your snippets in external files and include them, replacing the snippet calls with a modified tag that adds an additional parameter identifying the file containing the snippet code.

I would not be surprised if the solution is quite trivial, but I am still learning, so don't have a laughing fit :).

#73: 31-Jan-2009, 09:29 AM

Foundation

OpenGeek
MODx Co-Founder
Posts: 6,934

damn accurate caricatures...

WWW
Regular tag for Wayfinder:
Code:
[[Wayfinder? &startId=`0` &outerTpl=`MyWayfinderOuterTpl`]]
"modified tag that adds an additional parameter":
Code:
[[include? &file=`assets/snippets/wayfinder/wayfinder.snippet.php` &return=`1` &startId=`0` &outerTpl=`MyWayfinderOuterTpl`]]
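The rewrite from the regular tag to the modified tag is mechanical, so it can be scripted when many templates need converting. A minimal sketch in Python; the `to_include_tag` helper and its regular expression are assumptions written for this illustration and only handle a single snippet tag per string.

```python
import re

# Matches [[Name? params]] or [!Name? params!] (one tag per string).
TAG = re.compile(r"^(\[\[|\[!)(\w+)\??\s*(.*?)\s*(\]\]|!\])$")

def to_include_tag(tag, file_path, wrapper="include"):
    """Rewrite a snippet tag so it calls the wrapper snippet with &file and &return."""
    m = TAG.match(tag.strip())
    if not m:
        raise ValueError(f"not a snippet tag: {tag!r}")
    opener, _name, params, closer = m.groups()
    extra = f"&file=`{file_path}` &return=`1`"
    body = f"{extra} {params}".strip()
    return f"{opener}{wrapper}? {body}{closer}"

regular = "[[Wayfinder? &startId=`0` &outerTpl=`MyWayfinderOuterTpl`]]"
modified = to_include_tag(regular, "assets/snippets/wayfinder/wayfinder.snippet.php")
print(modified)
```

The same helper works for a custom snippet file: only the path passed as `file_path` changes, the remaining parameters are carried over untouched.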

#74: 7-Feb-2009, 07:47 AM

Shekhar
Posts: 12

As I understand it, there is no problem holding that many documents,
as id in modx_site_content is of type int with size 11.

The only problem is how you organise your data.
You have to cache static documents and leave documents with dynamic data uncached.
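A quick check of the id headroom, as a small Python sketch. It assumes the column is a standard signed 4-byte MySQL INT; note that the 11 in int(11) is a display width, not a storage limit.

```python
SIGNED_INT_MAX = 2**31 - 1      # MySQL INT is 4 bytes, signed by default
UNSIGNED_INT_MAX = 2**32 - 1    # if the column were declared UNSIGNED

docs = 10_000
share = docs / SIGNED_INT_MAX   # fraction of the id space used by 10k docs
print(SIGNED_INT_MAX, UNSIGNED_INT_MAX, f"{share:.6%}")
```

So the id column is nowhere near being the bottleneck; the practical limits discussed in this thread come from cache size and query patterns.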
 

#75: 5-Apr-2009, 12:56 AM


peterbrown
Posts: 24

WWW
Hello All,

After installing MODx on a Linux/Apache site, I've spent a few days scouring the MODx site, the forums, the Wiki and many, many reviews found via Google, while trying to decide if I can recommend MODx to the higher ups in our publishing company. Forgive my lengthy post, but I'm trying to get some perspective on using MODx for large data sets. I've read this entire thread, by the way.

I'm very impressed with the MODx concepts. I like the extensible 'hooks' a LOT.

As some background, I wrote my own Perl/MySQL CMS, which I use at:
http://worldcommunity.com/journal/.
It has only around 300+ articles, so I'm safe for awhile, in terms of that 5,000 limit :-).

However, I used the very same software for the publishing company I work at for my day job, for 3 of their magazines, quite successfully. The CMS I wrote contained articles for the 3 mags, plus a number of micro-sites, with a total of about 15,000 articles. One mag had 7,000 articles. My program (WCN:SQL) wrote out all of the pages to static html files, and used "section pages" to display paginated lists of documents in each section, with Previous and Next links. The section pages were also static files. I never ran into a problem with maxing out, because the Perl code created the Next links using MySQL's limit clause, and then translated that into the appropriate filenames. It took about 1-2 minutes to write out about 7,000 files, which we did every time an editor added or updated a story.

Here's an example of a paginated section from my smaller site:
http://worldcommunity.com/pages/s.0001/t.p0001.html

That said, our publishing company has moved on to a so-called "enterprise" system that we all now officially hate (Ektron). The magazine that had 7,000 articles probably has 8,000 or 9,000 by now, and will continue to grow. Many of the 30+ mags in our new larger company have multiple posts each day, so things add up really quickly over the years.

Also, it would be nice to have one MODx install for 3 or 4 mags. It is of course possible to have one installation per magazine, but the mags belong to groups, and the way we have it set up now is that the editors from one group log in and can take care of the 3 or 4 mags in their groups. That would mean, however, that the numbers would jump from 7,000 to many more, based on the group.

So, I'm looking for something better and faster than Ektron. My old system won't quite cut it, in terms of desired features, (and it's a one-man endeavor, so it's not being considered :-).

Is there any clarity at this point, in terms of the real upper limits of MODx Revolution?

It seems to me that it's not only the caching issue, but Ditto itself that may have some problems. I've been searching for the code and explanation of how to display paginated pages of sections that have hundreds of articles (like my system did, see above), and it seems like Ditto is the snippet everyone uses, but some of the posts in this thread seemed to say that Ditto would choke on large numbers of records.

Is there an extension that just has the Previous and Next links, without the numbered links, that might be a lot quicker?

Besides the cache issue and Ditto, it seems that the Manager Tree might get unwieldy with 8 or 9 or 10,000 documents.
With Ektron (and with my Perl system), one can simply page through the documents, with Next links, and/or search for docs to edit, so it doesn't get unwieldy.

Finally, although this is slightly off topic (but will get back on topic really quickly), our magazines (and mine too) use "multi-section" pages, like this:

http://worldcommunity.com/pages/c.0001.html

to display a certain amount of links within a section, with multiple sections displayed on the page. The above link only displays one article in each section, but it could be 3 or 5 or whatever. The order of sections in the multi-section page (or "category page") can be set in the CMS.

Can MODx do that as well? Would that have to be done by placing one Ditto snippet for each section, by hand, on the category page? It would be nice if it was automatic, so that when one added a section to a larger category, the section displayed on the multi-section page. But I also saw higher up in this thread (and this IS on topic) that having multiple Ditto sections on a page with large numbers of articles makes the page very slow. So that's a problem. It seems that Ditto is parsing through all the records, or did I read that wrong?

Based on all of the above, do you think MODx Revolution will fit the bill?

Thanks for what looks like a great program!

Peter Brown
« Last Edit: 5-Apr-2009, 01:06 AM by peterbrown »

#76: 5-Apr-2009, 08:12 AM

Foundation

rthrash
Posts: 11,348

WWW
Hi Peter, I am confident MODx Revolution can fit the bill quite well with a few custom snippets. I'd like to speak with you offline about this though as there's a lot that can be covered quickly in a conversation that would take hours of typing back and forth. Please contact me via ryan -atmark- collabpad.com. Thanks!

#77: 5-Apr-2009, 03:06 PM

Testers

ZAP
Posts: 1,619

I expect that Ryan and Jason et al could come up with a customized version of Revo that would do all of that, since Revo is not that far from being feature ready (it powers the new MODx site and many others). You could probably also make the current 0.9.6.x version work for you as long as you have a server with sufficient CPU to parse large cache files, minimize the site cache file as much as possible using the various tricks throughout these forums, and perhaps modify the document tree code in the Manager so as to avoid an overwhelming number of docs being displayed there. Personally I would think that given the scope of this project and the certainty that you'll be pushing up against the outer limits of the current version very soon, Revo would be the best choice.

Regarding your Ditto concern, I think that you'd only have issues with very large data sets when using uncached Ditto calls, but I expect that you could probably cache many of them and then this is a non-issue. I often don't use Ditto for very specific aggregate queries because I feel like it's too much tool for the job. If I find it's easy to just use a regular MySQL query or the MODx API to grab the data I need, I just do that.

The modular nature of MODx is really its strongest selling point for people like you who are comfortable extending it, since you don't have to be limited by the standard feature set.

EDIT: Ryan beat me to replying (somehow his reply didn't show up while I was typing mine; I guess I'd had that tab open a lot longer than I thought)...
« Last Edit: 5-Apr-2009, 07:42 PM by ZAP »
"Things are not what they appear to be; nor are they otherwise." - Buddha

"Well, gee, Buddha - that wasn't very helpful..." - ZAP

Useful MODx links: documentation | wiki  | forum guidelines  | bugs & requests  | info you should include with your post | commercial support options

#78: 5-Apr-2009, 07:04 PM


peterbrown
Posts: 24

WWW
Dear Ryan and Zap,

Thank you both very much for your responses.
As they say in Perl land (and maybe PHP land too), "there's more than one way to do it."

If custom snippets can make things work, that's good news.

Ryan, I'll email you off list.
My email is peterbrown@worldcommunity.com.

Best regards,

Peter

#79: 18-Jun-2009, 06:39 PM

nondotzero
Posts: 5

Has this document 'limit' been lifted in the RC1 release of Evolution? Or are there plans to resolve this issue for Evolution in the future?
Greetz!

#80: 18-Jun-2009, 06:45 PM

cipa
Posts: 918

I found this on the issue

http://svn.modxcms.com/jira/browse/MODX-446

MODX-446 is part of Evolution now.
current goal: 66 posts on modxrules.com
plugin: Template Rules (updated version of automaticTpl)
snippets: ParentParent
manager hack: Custom Manager Tree
plugin [in progress]: templateManager