LDR / Metadata Development Notes
I would like to see a clear line drawn between LDR/Metadata code and the packages it depends upon, so that administrators have more flexibility in deciding how to run their sites, and also to show that the "L" might actually mean "lightweight".
I would also like to see frequent minor updates and to have that process be quick and painless.
The Metadata server requires access to a MySQL server to store the metadata.
A Python interpreter, version 2.3 or better, is required.
Globus and pyGlobus
Additionally, the following Python libraries must be installed: Version requirements?
- Nevow really?
A MySQL server and MySQLdb should be available as a standard package in any Linux distribution.
Globus and pyGlobus are in the VDT, which doesn't seem to support Solaris. We're probably looking at setting up a pacman cache for this.
The remaining Python libraries are available as standard Python packages, installed the usual way with python setup.py install --prefix=$METADATA_LOCATION. The Python Metadata library and executables are also available as a standard Python package. Should we also distribute them with pacman? Some of these are also required for LDR. How is that going to work? And what about Solaris?
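As a sketch, a prefix install and the PYTHONPATH it implies might look like the following. The prefix and Python version here are assumptions (the environment setup further below uses /opt/metadata and python2.4); adjust for your site:

```shell
# Hypothetical install prefix and interpreter version -- adjust as needed.
METADATA_LOCATION=/opt/metadata
PYVER=2.4

# Run from inside the unpacked package (shown commented out here):
# python setup.py install --prefix=$METADATA_LOCATION

# A --prefix install lands modules under lib/pythonX.Y/site-packages,
# which must then be on PYTHONPATH for the interpreter to find them:
export PYTHONPATH=$METADATA_LOCATION/lib/python$PYVER/site-packages
echo "$PYTHONPATH"
```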
Download and install the VDT. Set the environment variables below first for a quicker install. Make sure to read the licensing first.
export VDTSETUP_AGREE_TO_LICENSES=y
export VDTSETUP_EDG_CRL_UPDATE=n
export VDTSETUP_ENABLE_GATEKEEPER=n
export VDTSETUP_ENABLE_GLOBUS_ROTATE=y
export VDTSETUP_ENABLE_GRIDFTP=y
export VDTSETUP_ENABLE_GRIS=n
export VDTSETUP_ENABLE_JM_CONDOR=n
export VDTSETUP_GRIS_AUTH=n
export VDTSETUP_INSTALL_CERTS=l
Install (order is somewhat important):
Get a service certificate.
Make sure your firewall has ports 8000, 8083, and 8084 open. (?) Be careful with port 4040: either block it or turn off manhole. These are dumb port numbers. What should they be?
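On a host that manages its firewall with iptables, the rules might look like the following sketch (chain and policy details are assumptions about the local setup; port numbers are the ones above):

```shell
iptables -A INPUT -p tcp --dport 8000 -j ACCEPT
iptables -A INPUT -p tcp --dport 8083 -j ACCEPT
iptables -A INPUT -p tcp --dport 8084 -j ACCEPT
# If manhole stays enabled, keep its port away from the outside world:
iptables -A INPUT -p tcp --dport 4040 -j DROP
```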
Database user permissions. (GRANT statements here)
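Until the real statements are filled in, a hedged sketch of what the grants might look like (the database name "metadata", user "ldr", and password below are all made up; use your site's values):

```sql
-- Hypothetical database and account names -- adjust for your site.
CREATE DATABASE metadata;
GRANT SELECT, INSERT, UPDATE, DELETE, CREATE, INDEX
    ON metadata.* TO 'ldr'@'localhost' IDENTIFIED BY 'changeme';
FLUSH PRIVILEGES;
```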
Edit $METADATA_LOCATION/etc/Metadata.ini. In particular, set the full update cache path, the URL for locating the metadata database, and the location of the HTML resources.
Make sure paths are set properly in your environment:
export X509_USER_CERT=/opt/metadata/etc/usercert.pem
export X509_USER_KEY=/opt/metadata/etc/userkey.pem
export PYTHONPATH=/opt/metadata/lib/python2.4/site-packages/
export PATH=$PATH:/opt/metadata/bin
export CONFIG=/opt/metadata/etc/Metadata.ini
source /opt/vdt/setup.sh
These are from the document produced from a previous LDR planning meeting.
Each of these should be meditated on and commented upon.
- New Metadata Schema
- This was done in the 0.7.x series.
- New Propagation Method
- This was done in the 0.7.x series.
- Speeding up Transfer of Small Files
- Smooth MySQL Upgrades
- Choice to use Existing MySQL Installation
- Red/Green Light Status Page
- This will appear in the Metadata Server in 0.8.0. Not yet in LDR.
- Multi-Site Scheduling
- More Flexible Ordering
- On-the-fly Configuration
- Rotating Log Files
- This was done in the 0.7.x series.
- Separate Critical Message Log
- Data Discovery for LDR Publishing
- Was looked at but not yet implemented.
- LDRVerify to be Easily Extended
Nice But Not Required
- Percentage of Collection Gauge
- Display Data Transfer Rates
- Browsing of Available Data
- Archive Transfer Rate Information
- Measure Aggregate Throughput of Transfers
- Scheduling Based on Historical Information
- Validity Checking at Both Ends of Transfer
- Tools for Finding Metadata Holes
- Tools for Fixing Metadata Holes
- Master/Slave MySQL
- LDR Level Authorization
- Leveraging BigBrother
- LSCdataFind to Influence Priority
Misc links of varying applicability:
Ben's email and Kevin's comments regarding priorities:
- Kevin, Brian,
-
- Here are my suggestions for LDR fix priorities, organized by category.
-
- *Performance*
-
- PR  Recommend  Description
- 53  High       LDRTransfer single write thread performance bottleneck.
-                If we ever want to do L0 files again, this will need to
-                be fixed. If Python can't do it, use C with pthreads.

The only issue with making this one high is making sure we have time for everything... I agree it's important, though. Will think about it.

- 35  High       LSCdataFind can't request an interval that is smaller
-                than a frame file. This *really* needs to be fixed. I
-                suggest a non-SQL-only solution, i.e., pre/post
-                processing of the query.

Stuart has just indicated to me that Dan Kozak has a SQL fix for this? I had no idea, so I will contact him and find out.
Am I an idiot? Did we try this? I don't see how it wouldn't work.
- CREATE INDEX foo ON <table> (gpsStart, gpsEnd);
- ... UNION SELECT ... WHERE ... ($start BETWEEN gpsStart AND gpsEnd OR $end BETWEEN gpsStart AND gpsEnd) ...
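For what it's worth, the BETWEEN form above catches a request that falls inside a frame, but not a frame that falls entirely inside the request; the symmetric overlap test covers both cases with one condition. A sketch (the table name and the lfn column are made up; only gpsStart/gpsEnd come from the discussion above):

```sql
CREATE INDEX gps_interval ON frame_metadata (gpsStart, gpsEnd);

-- All frames overlapping the requested interval [$start, $end]:
SELECT lfn
  FROM frame_metadata
 WHERE gpsStart <= $end
   AND gpsEnd   >= $start;
```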
- 104 High       Number of files queued per collection should be
-                definable. In light of recent problems with SFT
-                transfers, this should be a high priority. Seems easy
-                to do at first glance too.

Ok.

- *Already Fixed*
-
- PR  Recommend  Description
- 85  Drop       Expose multiple localhost PFNs based on sysadmin prefs.
-                This is already fixed in LDRdataFindServer.

Thanks, that's right.

- 62  Drop       Doesn't it already check file sizes after transfer?

It's not 'standard' like md5sums are... I think someone just added it to the storage modules at CIT or the sites (or both... I forget where I saw it). So I'll keep this open until it's better integrated.

- *Added*
-
- I've added a couple of LDRSchedule PRs to the LDR gnats. I generally
- think a revamp of LDRSchedule would be a high priority, and would
- alleviate a lot of admin headaches.
-
- 105 LDRSchedule needs to perform collection queries in parallel
- 106 LDRSchedule should be verbose on a collection by collection level
-
- I made the mistake of listing these as bugs. They should be "requests".

I made them requests. Thanks for adding these. As far as the revamp goes, do you mean a rewrite? Either way, there are a lot of issues with it that I agree having fixed would make lives easier for admins. That's a pretty big priority.
Stuart's email about priorities with Kevin's response:
- Kevin, Brian, Patrick,
-
- Sorry I have been so slow in getting to this, but here is my take on the
- prioritization of the current 56 LDR bugs. I did not actually read the
- full description of all of them, but hopefully this will be helpful.
- I understand that you are working with just the "Priority" field and
- that the various states are:
-
-   high:   will definitely fix for next release
-   medium: will try to fix for next release
-   low:    will not fix for next release

Correct.

- At the top of the high list, i.e., ultra-high, I would put 103, 93, 72,
- and 73 (if not already fixed in 0.7). If 20 is related to 103, then it
- also should be "ultra-high" (or one of them closed as a duplicate).
- 103 Partially failed RLS registrations can prevent a complete file transfer.
- 93 LDRMetadataUpdate blows away existing metadata.
- 72 LDRMaster crashed with logging to terminal.
- 73 LDRMetadataUpdate crash.
- 20 Publishing needs to be atomic.
- I would also move the following from medium to high: 55, 85 (appears to
- be a duplicate of 39, and Ben has a patch for this), plus a new PR for
- integrating Ben's new code into LDRdataFindServer as an option to query
- an LDAS diskcacheAPI hash table for user queries. Non-LDAS sites could
- use lsync for this once it is sufficiently tested and re-packaged in an
- RPM.

Thanks, that's the kind of input I was looking for. I will make a PR for the dataFind integration later... I'm not sure what priority I will make it. I'd like to have it in the next version, but I also want to make sure we have time to implement ALL of these changes and test them thoroughly. I expect to have time to do it, though.
- 55 LDRSchedule should log the number of LFNs found for each collection.
- Here are a few other thoughts regarding this release:
-
- While it will not currently help out the Lab installations, you should
- probably consider wrapping up a full LDR installation in an RPM to make
- it easier for new sites to come on-line if they are going to run on
- Linux.

I agree. I would love to do this. I am not a fan of pacman.

- For 35, please contact Dan Kozak, who has an alternate SQL query that
- he thinks fixes this without any adverse performance side effects.

I had no idea... that's great if he does.

- I don't see any PRs associated with improved small file support. What
- happened to this effort, especially Keith Bayer's Master's project?

That seems to have dropped off the map... I'm not sure if for good or not. I can't recall the status. I'll figure out if there's anything salvageable from that.

- I don't see a PR associated with better support for deleting metadata,
- e.g., burst MDC frames, bad GEO frames, incorrect LIGO h(t), ...
- Sooner or later I think this is something we need better support for.

If you could sometime create a PR detailing what exactly you'd like, that would help.

- I never really understood what the LDR-0.7 python problem is with
- signals, but is that going to be addressed only on a case-by-case PR
- basis, or is there a way to "just fix this"?

It was an issue in the Globus/pyGlobus software. It wasn't simple on the surface, meaning I need time to look into it. It should be a fix for all of the daemons, not each one by one. Globus is basically trapping signals and not allowing us to handle them. We have to find the appropriate way to deal with that.

- Another "new" idea I would like to consider (and open a PR if it makes
- sense) is to add an additional authentication method to
- LSCdataFind/LDRdataFindServer. Both Scott and I previously fought and
- lost the battle that PFN discovery should be done up-front. However, I
- am no longer sure I was on the "right side": to better support more
- complicated grid models, we may actually want to delay PFN discovery to
- "just in time" in some cases. At any rate, given a de facto need for
- this from the Burst group, I would like to suggest that the datafind
- protocol be enhanced to support IP-based authentication. That is, if we
- have several hundred client machines hitting datafindserver from a
- private cluster network, we probably don't want, need, or are even able
- to support several hundred simultaneous and expensive PKI
- authentication cycles.
-
- What do you think?

This is definitely an interesting idea. I think it's a good one. I will open a PR about it so it stays on the table. I'll have to think about the implications, but offhand it seems OK to me.