oss
Site Admin
Posts: 5890
Joined: Sat Feb 25, 2006 11:26 pm
Contact: Website

[ADMIN] XML-RPC (API) temporarily off

Tue May 19, 2009 4:08 am

Hello,

For a couple of days we have to shut down XML-RPC. The reason is overloaded servers (the load on the 3 webservers we have now is around 20), so both the website and XML-RPC (which EVERY program uses) were going down.

I know it is really BAD to do this, but it is better than doing nothing and having both services down. A new server will be online (at least I hope) this week (by 22.5., so the weekend should be better).

I am really sorry for this, but we are working on a solution. Hosting costs us around 1000 EUR EVERY month.
Last edited by oss on Fri Aug 21, 2009 2:59 pm, edited 1 time in total.

eduo
Posts: 716
Joined: Sat Feb 10, 2007 1:40 am
Location: Information Technology
Contact: ICQ Website Yahoo Messenger

Wed May 20, 2009 12:57 am

Ehrm. 1000 euro per month? Really? This number must be wrong. There's no amount of advertising that would pay for this.
http://eduo.info/
[url=http://eduo.info/soleol/]OpenSubtitles from your desktop: SolEol for Mac/Windows/Linux[/url]
[url=http://forums.plexapp.com/index.php?showtopic=325&st=0&p=2480&#entry2480]My current episode processing work flow[/url].

Christofer
Posts: 7
Joined: Tue Apr 01, 2008 5:25 pm

Wed May 20, 2009 12:08 pm

eduo wrote: "There's no amount of advertising that would pay for this."
Sure there is. I believe the current ads earn OS more than 1000€ per month. I would guess the ads generate a CPM rate of $0.50 and the site gets 200 000 pageviews a day. So I think OS does all right, and rightfully so. :)
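
As a rough check of that guess, here is a minimal sketch of the CPM arithmetic. It assumes one paid ad impression per pageview and a 30-day month, neither of which is stated in the post; the function name is made up for illustration.

```python
def monthly_ad_revenue(pageviews_per_day, cpm_usd, days=30):
    """Estimated revenue in USD; CPM is the price paid per 1000 impressions.

    Assumption (not from the thread): exactly one impression per pageview.
    """
    return pageviews_per_day / 1000 * cpm_usd * days


estimate = monthly_ad_revenue(200_000, 0.50)
print(f"${estimate:,.0f} per month")
```

The result is only as good as the guessed CPM and pageview count; halving either one halves the estimate.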

oss
Site Admin
Posts: 5890
Joined: Sat Feb 25, 2006 11:26 pm
Contact: Website

Wed May 20, 2009 12:56 pm

Omerta, how do you know the download counter is not working? If I look at the newest subtitles and click through to page 2, 3 and so on, I can see that some of the subtitles have downloads.

When XML-RPC is down, you can see a huge difference in how many subtitles are downloaded using programs, look:

http://www.opensubtitles.org/en/statistics

So we can say that when XML-RPC and the site are working OK, there are 300,000 downloads per day; when it is off, there are 40,000 downloads per day. Huge difference, right?

eduo: 1000 EUR per month, right. So now you know why there are popups and other "shit" ads. With Google ads I could easily pay for hosting, because they were performing the best, but Google banned opensubtitles, so I had to switch to those "flashy, shitty" ads, which nobody likes.

Christofer, wrong. You can count on 0.5 USD CPM with US traffic, not with international traffic like OS has.
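
To put the two daily figures side by side, a small sketch follows. It assumes that web-page downloads stay at roughly the same level whether or not XML-RPC is running, which the post does not state.

```python
downloads_with_api = 300_000     # per day, site and XML-RPC both up (from the post)
downloads_without_api = 40_000   # per day, XML-RPC switched off (from the post)

# Assumption: the difference is entirely due to programs using the API.
api_downloads = downloads_with_api - downloads_without_api
api_share = api_downloads / downloads_with_api

print(f"{api_downloads} downloads/day via programs ({api_share:.0%} of the total)")
```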

eduo
Posts: 716
Joined: Sat Feb 10, 2007 1:40 am
Location: Information Technology
Contact: ICQ Website Yahoo Messenger

Wed May 20, 2009 1:37 pm

OS: Please. I've always said I didn't like ads but I don't insult OS or call them "shit" (I think) except for questionable sites. I have always understood the website needed to break even and, if possible, make a profit (I'm OK with OS making a profit, really).

If you tell me (privately, if you want) the details of the current hosting plan and current usage you have I can see about getting you a better deal. 1000EUR a month is too much even for OpenSubtitles. There's something that doesn't match there and maybe you're being ripped off (I'm not implying you haven't researched already, but I can't know).

There is also the matter of trying to balance the load across different servers as we've talked about in the past.

If Google didn't provide a reason for the ban, it could have had to do with the rest of the ads. Google is very picky about who they share ad space with and tends to ban sites that run too much advertising from different places.

oss
Site Admin
Posts: 5890
Joined: Sat Feb 25, 2006 11:26 pm
Contact: Website

Wed May 20, 2009 1:44 pm

Personally I hate advertising. I am happy that you and maybe some other users understand why we have this evil online: it is needed to run such a site.

As for hosting: yes, we had servers at Hetzner. They are good and cheap, but the problem was that they received some letters from some companies (same as The Pirate Bay), and they just told us to go away. That's the problem with most hosting companies, so it is not only about the price. I hope you get me...

I am not saying we are hosting with the cheapest company, but at least I am happy (and especially Danger) that we don't have to move our servers every 3 months.

eduo
Posts: 716
Joined: Sat Feb 10, 2007 1:40 am
Location: Information Technology
Contact: ICQ Website Yahoo Messenger

Wed May 20, 2009 1:58 pm

Yes, understood. But if it can't be supported there's no upside to what it costs.

You could give S3 a go. Keeping a local copy of all subtitles and sending them all to S3 to be served could be useful, and it just implies a change in the URL of the downloadable file (you would be forced to use static subtitle files instead of generating them on the fly, though). Same for posters.

Some parts of the redesigned API could be made static and placed in S3 as well, which could help.

oss
Site Admin
Posts: 5890
Joined: Sat Feb 25, 2006 11:26 pm
Contact: Website

Wed May 20, 2009 2:10 pm

I have heard about S3. For sure I can give it a try. The problem is with the content: they don't allow copyrighted stuff (I don't even have to read their policies)...

...of course we can talk about whether subtitles are copyrighted and so on, but these big companies just don't care... same as Google: they just write that it is a grey zone, copyright violation and... you know the rest.

eduo
Posts: 716
Joined: Sat Feb 10, 2007 1:40 am
Location: Information Technology
Contact: ICQ Website Yahoo Messenger

Wed May 20, 2009 4:09 pm

I know. That's the reason I suggested always having a local copy you can fall back on :)

http://www.labnol.org/internet/lower-am ... time/5193/
http://cardbox.wordpress.com/category/amazon-s3/

Every static file from the site can be moved there, and it's probably cheaper to serve it from there than locally. You can also alter the cache settings so it's even cheaper.

You could even get creative with 301 redirects if you don't want to directly link the S3 files for some reason (like obfuscating).
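
A minimal sketch of that redirect idea, with the local copy as the fallback. The bucket URL, the local base URL, the `.gz` file naming and the function name are all hypothetical; nothing here reflects how OpenSubtitles actually serves files.

```python
from typing import Set, Tuple

# Hypothetical locations; neither URL comes from the thread.
S3_BASE = "https://example-bucket.s3.amazonaws.com/subtitles"
LOCAL_BASE = "https://www.example.org/static/subtitles"


def redirect_for(subtitle_id: int, uploaded_to_s3: Set[int]) -> Tuple[int, str]:
    """Return (HTTP status, Location header) for a subtitle download request."""
    filename = f"{subtitle_id}.gz"
    if subtitle_id in uploaded_to_s3:
        # Permanent redirect: clients may cache it and skip this server next time.
        return 301, f"{S3_BASE}/{filename}"
    # Not offloaded (or removed from S3): temporary redirect to the local copy.
    return 302, f"{LOCAL_BASE}/{filename}"


print(redirect_for(42, {42}))
print(redirect_for(7, {42}))
```

One design caveat: because a 301 can be cached by clients, a 302 for the S3 case as well would make it easier to fall back to the local copy if files ever had to be pulled from S3.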

JPDeckers
Posts: 4
Joined: Wed May 20, 2009 5:44 pm

Wed May 20, 2009 5:56 pm

oss wrote: "I heard about S3. For sure I can give a try, problem is with content, they don't allow (I dont have to read their policies) copyrighted stuff..."
What volume of traffic are we talking about?
Given that you ARE already paying, there are more options to distribute the content via CDNs, for even lower prices than S3. The catch is that the volume should be reasonably large.

Another idea (not hindered by any knowledge at all of the XML-RPC setup): is it possible to do gzip compression on the HTTP requests (or is this already implemented?) Might make (quite) a difference.
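
To illustrate why gzip could matter for XML-RPC traffic, here is a small sketch using Python's standard `gzip` module. The payload is invented for the example (a repeated `<member>` element standing in for a list of results) and is not a real OpenSubtitles response.

```python
import gzip

# Hypothetical XML-RPC style response body; repetitive markup compresses well.
member = (
    "<member><name>SubFileName</name>"
    "<value><string>example.srt</string></value></member>"
)
payload = (
    "<?xml version='1.0'?><methodResponse>" + member * 200 + "</methodResponse>"
).encode("utf-8")

compressed = gzip.compress(payload)
ratio = len(compressed) / len(payload)
print(f"{len(payload)} bytes -> {len(compressed)} bytes ({ratio:.1%} of original)")

# Round trip: the client gets the identical bytes back after decompressing.
assert gzip.decompress(compressed) == payload
```

Real responses are less repetitive than this toy payload, so the saving in practice would be smaller.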

eduo
Posts: 716
Joined: Sat Feb 10, 2007 1:40 am
Location: Information Technology
Contact: ICQ Website Yahoo Messenger

Wed May 20, 2009 8:13 pm

Gzip is implemented. I found out the hard way when gunzipping error pages thinking they were subtitles :)

The problem with opensubtitles is not so much the total volume; it's the sheer number of small files that kills it. Since every service out there charges both by size and by GET request, it would be a matter of choosing the least expensive.

Still, S3, S3 + CloudFront (a CDN-like service) or a proper CDN would all be cheaper than serving the files directly. And it comes with a guarantee of scaling as needed that the current local hosting can't provide.

I'd recommend moving to S3 first, which can be used to test the waters and has no minimums. Combined with cache expiration trickery it might be a good choice to offload a lot of bandwidth hogging.

Incidentally, as for traffic, we're talking roughly a third of a million subtitle requests a day, with each being 10 to 25K in gzipped form. The problem is the number of requests, not the total volume transferred (which is a problem, but secondary).
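
Those figures can be turned into a daily volume and an average request rate. A rough sketch, assuming decimal units (1 GB = 1,000,000 KB) and an even spread over the day, which real traffic will not have:

```python
requests_per_day = 1_000_000 / 3   # "a third of a million" (from the post)
low_kb, high_kb = 10, 25           # gzipped size range per subtitle (from the post)

# Assumption: requests are spread evenly over 24 hours; peaks will be higher.
avg_requests_per_second = requests_per_day / 86_400

# Assumption: decimal units, 1 GB = 1,000,000 KB.
low_gb_per_day = requests_per_day * low_kb / 1_000_000
high_gb_per_day = requests_per_day * high_kb / 1_000_000

print(f"average rate: {avg_requests_per_second:.1f} requests/s")
print(f"daily volume: {low_gb_per_day:.1f} to {high_gb_per_day:.1f} GB")
```

The average says nothing about how expensive each request is for the database, which is the part the post singles out.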

That would leave the database problems to be tackled, but that is a matter of rethinking the architecture and can be done step by step.

garyrobertdavidson
Posts: 1
Joined: Wed May 20, 2009 11:57 pm

Some tips..

Thu May 21, 2009 12:43 am

After missing the cool functions of OS I have decided to at least lend a helping hand.

Firstly:
OS has always been my absolute favorite web page for downloading subtitles for all my naughty TV shows and films.

The API function is great. I don't even want to try any others. I use SubDownloader and not the page itself to download everything I need.

Problems
It seems that I and the other users downloading via programs are killing your servers.

To understand more
I'm not sure how everything is done internally, I mean whether you save all subs in a database or keep them file-based.

Solutions, or some ideas at midnight
1) Completely split the DB from the subs. In the DB you just keep the various names and other info, which link to a hash; the hash values are the file names.

Basically, saving the subs under their hash value is the key. That way most API programs can grab the file without needing to hammer the DB.

2) Save all subs zipped in a file system. Get people who don't want to donate money to simply donate some web space; 10-100 MB will do for lots of subs.

3) If all files are saved locally, then password-protect the directory and change the password every day/week/month. This will at least make sure each user has to load a web page (with adverts if you want, I wouldn't mind) and read out the new password. It will also stop the fully automated users from asking every 5 minutes, all year round, for a sub which doesn't exist.

4) Let Google cache everything. I think this has been suggested before, and I'm not too sure it would work that well, but Google caches most web sites pretty well, so why not try changing robots.txt to let it go through the latest subs? The programs using the API could then be changed to try the Google cache first, then OS.

5) (Crazy idea) Compress the subs data into JPEGs, then upload them to Picasa or ImageShack (did I say it was a crazy idea?). But maybe there is some other way to use someone else's traffic.

6) Torrent all subs. Most of the subs are for stuff downloaded as torrents, so why not torrent the subs? Maybe a Vuze plugin could be written to auto-download subs as a torrent or, when not found, download from OS and automatically seed it as a torrent.
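
Idea 1 above (files named after their hash, so a client can fetch them without a DB lookup) could be sketched like this. The two-level directory sharding, the `subs` root and the `.gz` extension are assumptions added for the example; the post only says the hash should be the file name.

```python
from pathlib import PurePosixPath

HEX_DIGITS = set("0123456789abcdef")


def subtitle_path(file_hash: str, root: str = "subs") -> PurePosixPath:
    """Map a hex hash to a file path without touching any database.

    Assumption: files are sharded into two directory levels (first four hex
    digits) so that no single directory ends up holding millions of files.
    """
    h = file_hash.lower()
    if len(h) < 4 or not set(h) <= HEX_DIGITS:
        raise ValueError("expected a hex hash of at least 4 characters")
    return PurePosixPath(root) / h[:2] / h[2:4] / f"{h}.gz"


print(subtitle_path("8E245D9679D31E12"))
```

Because the path is a pure function of the hash, any web server or static host can serve the file directly.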

OK, those are some ideas on how I might do it. They may all sound stupid, but I'm just trying to offer something in return for the service OS provides.

PS. Maybe I'm wrong in my assumption that it's the APIs and their programs which are killing OS. If you fill me in with some more info on what's going wrong I'll be glad to help (without my checkbook, sorry).

eduo
Posts: 716
Joined: Sat Feb 10, 2007 1:40 am
Location: Information Technology
Contact: ICQ Website Yahoo Messenger

Thu May 21, 2009 1:09 am

The API is indirectly killing OS, but the DB access behind it is the real reason. The API is just so massive that the site slashdots itself.

I was talking about this with OS just today. Reading this and comparing it with my own assumptions, I realize that we severely underestimate the size and traffic involved in OS.

Some stats that have been mentioned around:
-Half a million daily subtitle downloads.
-A third of a million daily pageviews.
-40 thousand subs downloaded from the web page (no massive download)
-Over 10GB of Subs in individually compressed files.

P2P has been mentioned before. I still believe it's a non-item and makes no sense. We're talking millions of individual files, with thousands more every day, each less than 50k in size.

What I've proposed is to have part of the API be file-dependent instead of DB-dependent, something you've touched upon here. Also: to put them all in Amazon S3 or a CDN.

byhash/moviehash-bytesize/iso-639/
--> list.xml

byimdbid/iso-639/
--> list.xml
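
A minimal sketch of how a client or a static-file generator might build the `byhash` path from that layout. The function name, the validation rules and the lowercase two- or three-letter language code are assumptions; only the `byhash/moviehash-bytesize/iso-639/list.xml` shape comes from the post.

```python
def byhash_index_path(moviehash: str, bytesize: int, lang: str) -> str:
    """Build the static index path: byhash/<moviehash>-<bytesize>/<iso-639>/list.xml."""
    if bytesize <= 0:
        raise ValueError("bytesize must be positive")
    lang = lang.lower()
    # Assumption: ISO 639 codes are given as 2 or 3 ASCII letters.
    if not (lang.isascii() and lang.isalpha() and len(lang) in (2, 3)):
        raise ValueError("expected a 2- or 3-letter ISO 639 code")
    return f"byhash/{moviehash.lower()}-{bytesize}/{lang}/list.xml"


print(byhash_index_path("8e245d9679d31e12", 12909756, "en"))
```

A lookup then becomes a plain GET for a static file, which S3 or a CDN can answer without any database behind it.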

But without knowing the plan for the rewrite all of this is just that, speculation :) OpenSubtitles is not Open, so we can only give ideas and offer help where possible.

Incidentally: I have invested a ton of effort and time in *selling* OpenSubtitles as a subtitle repository, but I won't hide the fact that, to me, the selling point is the API and hash-based searches. Without those, OpenSubtitles is one of the biggest subtitle repositories, but not much different outside of that. The differentiator is the API, and as such all my suggestions are aimed at making the API feasible and cost-effective. If I offer suggestions for the web itself it's just because I understand *something* needs to pay for all this bandwidth and CPU, and web ads are the only thing I'm willing to accept, rather than advertising in the API or, god forbid, in the subtitles themselves.

macofaco
Posts: 68
Joined: Mon Sep 22, 2008 8:31 pm
Contact: Website

Thu May 21, 2009 6:38 am

I think a major redesign is needed rather than buying new servers...

JPDeckers
Posts: 4
Joined: Wed May 20, 2009 5:44 pm

Thu May 21, 2009 7:11 am

macofaco wrote: "I think major redesign is needed rather then buying new servers..."
While this seems to be true to some extent, reading back over eduo's post, the problem seems to be the sheer volume of requests, which simply kills the servers.
eduo wrote:
Some stats that have been mentioned around:
-Half a million daily subtitle downloads.
-A third of a million daily pageviews.
-40 thousand subs downloaded from the web page (no massive download)
-Over 10GB of Subs in individually compressed files.
Thanks for this info!

I used to manage a site with TBs of traffic every day, and at that time had tens of servers handling the requests. We then switched to a CDN, which reduced the load on the servers dramatically and suddenly allowed us to push Gbps of data (read: tens of TBs/day).

Looking at the O.S. numbers, I get:
30*500k = 15m subtitle downloads/month
10 GB storage space
w/ 25kB per subtitle, traffic is roughly 360 GB/month.
This translates to a few Mbps peak usage.
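
The same estimate, worked through as a sketch. It assumes decimal units (1 kB = 1000 bytes), which gives 375 GB; with binary units the result is closer to the roughly 360 GB quoted. The average bandwidth assumes an even spread over a 30-day month, so peaks would be a few times higher.

```python
downloads_per_month = 30 * 500_000   # 15 million (from the post)
bytes_per_subtitle = 25 * 1000       # 25 kB each, assuming decimal kilobytes

monthly_bytes = downloads_per_month * bytes_per_subtitle
monthly_gb = monthly_bytes / 1e9

# Assumption: traffic spread evenly over 30 days.
seconds_per_month = 30 * 86_400
avg_mbps = monthly_bytes * 8 / seconds_per_month / 1e6

print(f"{monthly_gb:.0f} GB/month, about {avg_mbps:.2f} Mbps on average")
```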

Unless these numbers are totally off, I do not see this costing "1000 eur/month", so there must be some error in these numbers / some cheaper hosting solutions to find :)
Most likely the costs are in the number of servers involved in a non-competitive environment, which should be possible to bring down by switching to a CDN.

While S3 does charge per request, it is in the range of 1 ct per 10,000 GET requests, so for subtitle downloads it would cost (15m/10,000/100) = $15/month in GET requests. So switching to a CDN will certainly help reduce cost/load/everything.
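
The request-cost calculation above as a small sketch. The price of 1 cent per 10,000 GET requests is the figure quoted in this post and is not checked against any price list; the function name is made up for illustration.

```python
def get_request_cost_usd(requests: int, cents_per_10k: float = 1.0) -> float:
    """Cost of GET requests in USD, at a price given in cents per 10,000 requests."""
    return requests / 10_000 * cents_per_10k / 100


monthly_requests = 30 * 500_000   # 15 million downloads/month (from the post)
print(f"${get_request_cost_usd(monthly_requests):.2f} per month for GET requests")
```

Bandwidth and storage would be billed on top of this, so it covers only the per-request part of the bill.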

Problem might/will be the gray area the subtitles are in, so I would recommend going to a 3rd party CDN like panther express/limelight c.s. which don't cut you off (discuss during contract negotiations), so find one that has low commitments and acceptable pricing.

Just my $0.02 :)
