View Issue Details

ID: 0003714
Project: mantisbt
Category: upgrade
View Status: public
Last Update: 2004-08-29 02:35
Reporter: cbradney
Assigned To: thraxisp
Priority: normal
Severity: feature
Reproducibility: always
Status: closed
Resolution: fixed
Product Version: 0.18.2
Fixed in Version: 0.19.0rc1
Summary: 0003714: Please add in a way to transfer attachments from the database to disk
Description

When attachments become large, the database grows too much, and it would be much easier to store them on disk. Backups become unwieldy when they are run over the network or the internet.

Part of my email to mantis-help mailing list:
Do I have to go through every bug and right click and save the files and then upload to a particular directory? Which directory if so (couldn't see a config line)? And then update the config_inc file to say DISK? Will the existing bugs link to the files on disk then?

Additional Information

Victor's mailing list reply:
Please report this at http://bugs.mantisbt.org. At the moment there is no easy way to do it. You have to download all attachments, delete them from the bugs, and re-upload them.

In my opinion we can use one of the following solutions:
- Have a script that moves all attachments to disk and adjusts the database.
- Or add a field to the database that reflects where each attachment is stored. If the admin wants to switch to disk, then all new attachments would move to disk but the already existing ones will remain in the database.

Regards,
Victor.

Tags: No tags attached.
Attached Files
db2disk.diff (4,272 bytes)   
diff -ru mantisbt/CVS/Entries mantisbt_move/CVS/Entries
--- mantisbt/CVS/Entries	Thu Jul 22 07:00:41 2004
+++ mantisbt_move/CVS/Entries	Wed Jul 21 21:17:46 2004
@@ -37,7 +37,6 @@
 /bug_resolve.php/1.41/Sat Jun 26 14:05:42 2004//
 /bug_resolve_page.php/1.41/Sun Jul 11 13:24:29 2004//
 /bug_set_sponsorship.php/1.2/Tue Jul 20 11:11:14 2004//
-/bug_sponsorship_list_view_inc.php/1.10/Wed Jul 21 12:38:36 2004//
 /bug_update.php/1.63/Sat Jun 26 14:05:42 2004//
 /bug_update_advanced_page.php/1.72/Fri Jul 16 23:03:08 2004//
 /bug_update_page.php/1.75/Sun Jul 18 10:22:21 2004//
@@ -162,11 +161,9 @@
 /summary_graph_imp_status.php/1.20/Fri Mar  5 01:26:16 2004//
 /summary_jpgraph_page.php/1.21/Fri Mar  5 01:26:16 2004//
 /summary_page.php/1.39/Tue Jul 20 15:51:50 2004//
-/view.php/1.2/Wed Jul 21 10:23:36 2004//
 /view_all_bug_page.php/1.54/Tue Jul 20 15:51:50 2004//
 /view_all_inc.php/1.141/Wed May 26 05:25:18 2004//
 /view_all_set.php/1.35/Fri Jul  9 00:02:01 2004//
-/view_filters_page.php/1.15/Wed Jul 21 12:48:00 2004//
 D/admin////
 D/core////
 D/css////
@@ -178,3 +175,6 @@
 D/lang////
 D/packages////
 D/sql////
+/bug_sponsorship_list_view_inc.php/1.10/Thu Jul 22 01:17:45 2004//
+/view.php/1.2/Thu Jul 22 01:17:46 2004//
+/view_filters_page.php/1.15/Thu Jul 22 01:17:46 2004//
diff -ru mantisbt/CVS/Root mantisbt_move/CVS/Root
--- mantisbt/CVS/Root	Wed Jul 21 21:49:48 2004
+++ mantisbt_move/CVS/Root	Tue Jul 20 19:47:29 2004
@@ -1 +1 @@
-:ext:thraxisp@cvs.sourceforge.net:/cvsroot/mantisbt
+:pserver:anonymous@cvs.sourceforge.net:/cvsroot/mantisbt
diff -ru mantisbt/admin/CVS/Root mantisbt_move/admin/CVS/Root
--- mantisbt/admin/CVS/Root	Wed Jul 21 21:49:50 2004
+++ mantisbt_move/admin/CVS/Root	Tue Jul 20 19:47:35 2004
@@ -1 +1 @@
-:ext:thraxisp@cvs.sourceforge.net:/cvsroot/mantisbt
diff -ru mantisbt/admin/admin.css mantisbt_move/admin/admin.css
--- mantisbt/admin/admin.css	Tue Jan 14 08:46:58 2003
+++ mantisbt_move/admin/admin.css	Thu Jul 22 11:19:20 2004
@@ -5,4 +5,6 @@
 tr.top-bar { background-color: #98b8e8; padding: 3px; padding: 3px; color: #204888 }
 tr.top-bar td { border-top: 1px solid #222222; border-bottom: 1px solid #222222 }
 tr.top-bar td.title { text-align: right; font-size: 10pt; font-weight: bold; letter-spacing: 1.0em; }
-tr.top-bar td.links { text-align: left; font-size: 8pt }
\ No newline at end of file
+tr.top-bar td.links { text-align: left; font-size: 8pt }
+tr.row-1			{ background-color: #d8d8d8; color: #000000; }
+tr.row-2			{ background-color: #e8e8e8; color: #000000; }
diff -ru mantisbt/admin/upgrade_list.php mantisbt_move/admin/upgrade_list.php
--- mantisbt/admin/upgrade_list.php	Sun Feb 16 03:13:15 2003
+++ mantisbt_move/admin/upgrade_list.php	Tue Jul 20 20:21:23 2004
@@ -20,6 +20,7 @@
 	<h1>List of Upgrade Sets</h1>
 	<p>[ <a href="upgrade.php">Basic upgrade set (required)</a> ]</p>
 	<p>[ <a href="upgrade_escaping.php">String escaping fixes (recommended)</a> ]</p>
+	<p>[ <a href="upgrade_utility.php">System Upgrade Utilities (optional)</a> ]</p>
 	</td></tr></table>
 </div>
 </body>
diff -ru mantisbt/core/file_api.php mantisbt_move/core/file_api.php
--- mantisbt/core/file_api.php	Sat Jul 10 19:38:01 2004
+++ mantisbt_move/core/file_api.php	Thu Jul 22 11:10:49 2004
@@ -415,6 +415,13 @@
 
 		return false;
 	}
+
+	# --------------------
+	# clean file name by removing sensitive characters and replacing them with underscores
+	function file_clean_name( $p_filename ) {
+		return preg_replace( "/[\/\\ :&]/", "_", $p_filename); 
+	}
+
 	# --------------------
 	function file_add( $p_bug_id, $p_tmp_file, $p_file_name, $p_file_type='' ) {
 		$c_bug_id		= db_prepare_int( $p_bug_id );
diff -ru mantisbt/core/php_api.php mantisbt_move/core/php_api.php
--- mantisbt/core/php_api.php	Thu Apr  8 14:04:53 2004
+++ mantisbt_move/core/php_api.php	Thu Jul 22 11:07:16 2004
@@ -77,4 +77,20 @@
 			return key_exists( $key, $search );
 		}
 	}
+
+	# --------------------
+	# file_put_contents is normally in PEAR
+		if (!function_exists('file_put_contents')) {
+		function file_put_contents($filename, $data) {
+			if (($h = fopen($filename, 'w')) === false) {
+				return false;
+			}
+			if (($bytes = @fwrite($h, $data)) === false) {
+				return false;
+			}
+			fclose($h);
+			return $bytes;
+		}
+	}
+
 ?>
file_down.diff (422 bytes)   
Index: file_download.php
===================================================================
RCS file: /cvsroot/mantisbt/mantisbt/file_download.php,v
retrieving revision 1.28
diff -r1.28 file_download.php
78a79,83
> 	$t_download = config_get( 'file_upload_method' );
> 	if ( $v_content <> '' ) {
> 		$t_download = DATABASE;
> 	}
> 	
80c85
< 	switch ( config_get( 'file_upload_method' ) ) {
---
> 	switch ( $t_download ) {
db2disk.tar.gz (8,439 bytes)

Relationships

related to 0004078 (closed, thraxisp): Script to import the value of a custom field to a native field

Activities

redcom

2004-04-02 13:29

reporter   ~0005329

A configurable option could be added to specify the size threshold for mantis to decide if the file should be stored in the database or as a regular file.

vboctor

2004-07-11 07:03

manager   ~0005988

I don't see the reason for mixing database / disk unless we are keeping the old ones in the database and all the new ones go to disk. However, to make it controlled by a threshold doesn't add value in my opinion.

I would prefer adding a migration script to the admin directory that would move all attachments from database to file system and update the database accordingly. Ideally, the "last updated" timestamps of the issues will not be affected.

We should also probably consider changing the default value of $g_file_upload_method from DATABASE to DISK, or warning users somewhere that DATABASE is not recommended for attachments if they expect to have many attachments or big attachments.

cbradney

2004-07-11 08:39

reporter   ~0005990

No.. no reason to mix db and filesystem storage. The migration script will do fine, although I think it should work both ways (or there be 2 scripts). DEFAULT should be DISK for sure.

vboctor

2004-07-13 09:21

manager   ~0006016

thraxisp, good work!! I had a look at the patch and following are my comments:

...db2disk_inc.php

  • There is an obsolete comment "These upgrades fix the double escaped data ..."
  • Shouldn't we validate that the attachment directory exists and is writable?
  • Should this script require the user to change the setting for file uploads to use DISK? This will make sure that while the script is running, no more attachments will be added by web users.
  • You don't add ../ to the beginning of the path if it starts with '/'. What about Windows paths that start with \.... or C:....
  • I would like to log all the attachments that we attempt moving as follows:
    00001234: Moving 'ccccc.ccc' to '/..../ccccc.ccc' - success.
    00002345: Moving 'aaaaa.aaa' to '/..../aaaaa.aaa' - error: ...
    ...etc.
  • If we manage to move some of the files and some not (for some reason), or the database ends up having a mix, does the file download code handle this? Would be nice if it does. So if the user changes the config to use DISK, everything works fine, even for already existing attachments that were in DATABASE.
  • Ideally we should verify the md5 digest of the file to make sure that we didn't corrupt the files while transferring from db to disk.
  • After copying the files, do you set the content to '' in the database? If not, then you should.
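
A minimal sketch of two of the points above (the directory check and the proposed log format), written as standalone PHP with no Mantis APIs. The function names check_upload_dir and format_move_log are hypothetical and do not come from the patch; the example target path is made up.

```php
<?php
// Hypothetical helpers for illustration only; not part of Mantis.

// Returns an error message, or null when the directory can be used.
function check_upload_dir( $p_dir ) {
	if ( !is_dir( $p_dir ) ) {
		return "directory '$p_dir' does not exist";
	}
	if ( !is_writable( $p_dir ) ) {
		return "directory '$p_dir' is not writable";
	}
	return null;
}

// Formats one log line in the style proposed in the list above.
// $p_error is null on success, otherwise a short error description.
function format_move_log( $p_bug_id, $p_filename, $p_target, $p_error = null ) {
	$t_result = ( null === $p_error ) ? 'success.' : 'error: ' . $p_error;
	return sprintf( "%08d: Moving '%s' to '%s' - %s", $p_bug_id, $p_filename, $p_target, $t_result );
}

// prints: 00001234: Moving 'ccccc.ccc' to '/tmp/files/ccccc.ccc' - success.
echo format_move_log( 1234, 'ccccc.ccc', '/tmp/files/ccccc.ccc' ), "\n";
```

Running the directory check once before the loop, rather than per file, would let the script stop early with a single clear message.
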
thraxisp

2004-07-13 09:38

reporter   ~0006018

Attached is my first pass at moving data from the database to file system.

There are three files: admin/upgrade_list.php (1 line added), admin/upgrade_db2disk.php (new), and admin/upgrades/0_18_move_db2disk_inc.php (new).

It assumes that the system was running properly with the attachments going to the database. Before running the upgrade, you need to ensure that the attachments are switched to disk and operate correctly.

The upgrade process looks like any other database upgrade. It retrieves the attachment from the database and writes it to disk. It then alters the original database attachment record to reflect the new location of the file. Creation dates are left untouched.
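
The process described above could look roughly like the following sketch. It is not the code from the attached patch: the function name move_attachment_to_disk and the array keys 'filename', 'content' and 'diskfile' are assumptions, and a plain array stands in for the database record. The md5 comparison only checks the written file against the content read from the record, since no stored digests exist to compare against.

```php
<?php
// Sketch only: $p_row stands in for one attachment record.
// Returns the updated record, or false if the write or the check failed.
function move_attachment_to_disk( array $p_row, $p_dir ) {
	$t_target = rtrim( $p_dir, '/\\' ) . DIRECTORY_SEPARATOR . basename( $p_row['filename'] );
	$t_digest = md5( $p_row['content'] );

	if ( false === file_put_contents( $t_target, $p_row['content'] ) ) {
		return false;
	}

	// Re-read the file to confirm the bytes on disk match the record.
	if ( md5_file( $t_target ) !== $t_digest ) {
		unlink( $t_target );
		return false;
	}

	// Point the record at the file and clear the stored content.
	$p_row['diskfile'] = $t_target;
	$p_row['content']  = '';
	return $p_row;
}
```

A caller would persist the returned record only when the result is not false, so that a failed write leaves the original content in the database.
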

The upgrade doesn't work for project documents yet.

Comments are welcome.

thraxisp

2004-07-13 09:52

reporter   ~0006020

Last edited: 2004-07-13 09:53

Thanks. I'll look at the comments individually.

...db2disk_inc.php

- There is an obsolete comment "These upgrades fix the double escaped data ..."
I will fix this.

- Shouldn't we validate that the attachment directory exists and is writable? Should this script require the user to change the setting for file uploads to use DISK? This will make sure that while the script is running, no more attachments will be added by web users.
I will add some more robustness checks later this week.

- You don't add ../ to the beginning of the path if it starts with '/'. What about Windows paths that start with \.... or C:....
You are correct, I was trying to differentiate absolute paths from relative ones.

- I would like to log all the attachments that we attempt moving as follows:
00001234: Moving 'ccccc.ccc' to '/..../ccccc.ccc' - success.
00002345: Moving 'aaaaa.aaa' to '/..../aaaaa.aaa' - error: ...
...etc.
I will add this.

- If we manage to move some of the files and some not (for some reason), or the database ends up having a mix, does the file download code handle this? Would be nice if it does. So if the user changes the config to use DISK, everything works fine, even for already existing attachments that were in DATABASE.
I will add this to file_api.php. I will make the 'content' field override the (default) pointer in the database if it is set.

- Ideally we should verify the md5 digest of the file to make sure that we didn't corrupt the files while transferring from db to disk.
We don't have these digests today, so checking them may be moot. I can see an audit process for this in the future. Maybe we can discuss this this evening (EDT) on IRC.

- After copying the files, do you set the content to '' in the database? If not, then you should.
I do.


cbradney

2004-07-13 16:55

reporter   ~0006027

This stuff sounds great and I'll be very happy to test it when I get a chance. Hopefully this week. Thanks so much!

thraxisp

2004-07-13 20:44

reporter   ~0006030

The attached archive (db2disk2.tar.gz) has updated files that address most of the issues raised. I also added migration of project documents to disk.

I don't have access to a Windows server to test this, so I'd appreciate it if someone would check that it works.

vboctor

2004-07-20 05:48

manager   ~0006171

I had a look at the 2nd version and following are my comments:

  • Given that this "move" script is executed separately, it doesn't have to use the upgrade framework. Actually, it is not even an upgrade, it is a utility script. Hence, I would rather have it as a standalone script that just echoes its output rather than the upgrade_string changes that were added in the upgrade script. Also I think this string is not defined in the case of normal upgrades and hence it will generate notices (I think).

  • core/php_api.php is an API file that is implemented for providing compatibility with older versions of PHP. Hence, "file_put_contents" should be implemented in there. It is ok to import this file from the admin/ scripts since it doesn't include any other APIs.

  • Can we avoid the need for get_prefix() by changing the current working directory to dirname( dirname( __FILE__ ) ) [which is Mantis directory] and then changing it back to whatever it was after we are done?

  • The current implementation does not add "upgrade_move_doc2disk" to the upgrade array.

  • The project documents function uses $t_bug_file_table rather than $t_file_table in one of the queries.

  • Try to re-use code between the two methods when possible. For example, the change of a filename to a valid file system filename.

  • I didn't see any modifications to the file download code, does it currently handle the case where there is a mix of DATABASE and DISK?

thraxisp

2004-07-20 14:39

reporter   ~0006179

- Given that this "move" script is executed separately, it doesn't have to use the upgrade framework. Actually, it is not even an upgrade, it is a utility script. Hence, I would rather have it as a standalone script that just echoes its output rather than the upgrade_string changes that were added in the upgrade script. Also I think this string is not defined in the case of normal upgrades and hence it will generate notices (I think).
I was trying not to define yet another way of implementing utilities, and to make them look like the upgrades. I'll re-look at this, but it doesn't generate notices for other upgrades.

- core/php_api.php is an API file that is implemented for providing compatibility with older versions of PHP. Hence, "file_put_contents" should be implemented in there. It is ok to import this file from the admin/ scripts since it doesn't include any other APIs.
Done.

- Can we avoid the need for get_prefix() by changing the current working directory to dirname( dirname( __FILE__ ) ) [which is Mantis directory] and then changing it back to whatever it was after we are done?
The problem here is that the script is executed from the admin/ directory, rather than the mantis directory. If the file path is absolute, I don't need to adjust it. If it is relative (e.g. "files/" in the mantisbt/ directory), it needs the prefix so the files go in the right place.
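
The absolute-versus-relative distinction described here, including the Windows forms raised earlier in the thread, could be sketched as follows. The helper names is_absolute_path and resolve_upload_path and the default '../' prefix are assumptions for illustration; this is not the get_prefix() code from the patch.

```php
<?php
// Hypothetical helpers; names and the '../' default are assumptions.

// True for paths starting with '/' or '\' (Unix root, Windows root or UNC)
// and for drive-letter paths such as C:\ or C:/.
function is_absolute_path( $p_path ) {
	return 1 === preg_match( '/^(?:[\/\\\\]|[A-Za-z]:[\/\\\\])/', $p_path );
}

// Relative paths get the prefix so they resolve from the admin/ directory;
// absolute paths are returned unchanged.
function resolve_upload_path( $p_path, $p_prefix = '../' ) {
	return is_absolute_path( $p_path ) ? $p_path : $p_prefix . $p_path;
}

// prints: ../files/
echo resolve_upload_path( 'files/' ), "\n";
```

Drive-relative forms such as "C:files" are treated as relative by this sketch, which may or may not be what an installation needs.
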

- The current implementation does not add "upgrade_move_doc2disk" to the upgrade array.
- The project documents function uses $t_bug_file_table rather than $t_file_table in one of the queries.
This wasn't well tested.

- Try to re-use code between the two methods when possible. For example, the change of a filename to a valid file system filename.
Will do.
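
A shared filename-cleaning helper of the kind discussed here might look like the sketch below. The name clean_name_sketch is hypothetical. One detail worth checking in the attached db2disk.diff: inside a double-quoted PHP string, \\ collapses to a single backslash, so the pattern "/[\/\\ :&]/" appears to reach the regex engine as [\/\ :&], which matches '/', space, ':' and '&' but not a literal backslash. The sketch uses a single-quoted pattern with four backslashes so that backslashes are replaced as well.

```php
<?php
// Hypothetical helper: replaces '/', '\', space, ':' and '&' with underscores.
// In a single-quoted string, '\\\\' yields two backslashes, which the regex
// engine reads as one literal backslash inside the character class.
function clean_name_sketch( $p_filename ) {
	return preg_replace( '/[\/\\\\ :&]/', '_', $p_filename );
}

// prints: my_file_v1_2.txt
echo clean_name_sketch( 'my file:v1&2.txt' ), "\n";
```
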

- I didn't see any modifications to the file download code, does it currently handle the case where there is a mix of DATABASE and DISK?
The move won't happen unless "file_upload_method" is set to DISK. All of the existing files are moved, and removed from the schema. Thus you don't need the download code to handle both. I was originally looking at using a temporary file and the existing "file_api" methods, but that complicated the code significantly. The straight move seemed to be more appropriate.

thraxisp

2004-07-22 11:20

reporter   ~0006238

I've updated the files. db2disk.diff lists the diffs for existing files. db2disk.tar.gz has all of the changed files (admin/admin.css, admin/upgrade_utility.php, admin/upgrade_list.php, admin/upgrade_utility.php, core/file_api.php, core/php_api.php).

thraxisp

2004-07-22 12:55

reporter   ~0006241

file_down.diff changes file_download.php to use the schema contents, if set, to retrieve the attachment file. This would allow one to change from DATABASE to DISK without copying any data (old attachments left in schema and new ones on disk).
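
The selection rule in file_down.diff can be shown in isolation. In this sketch the constants METHOD_DISK and METHOD_DATABASE and the function pick_download_method are stand-ins (the real code uses Mantis's own DISK and DATABASE constants and config_get); only the decision itself mirrors the diff.

```php
<?php
// Stand-in constants; the values are assumptions, not Mantis's definitions.
const METHOD_DISK     = 'disk';
const METHOD_DATABASE = 'database';

// If the record still carries content, serve it from the database
// regardless of the configured method; otherwise use the configured method.
function pick_download_method( $p_configured, $p_content ) {
	return ( '' !== (string) $p_content ) ? METHOD_DATABASE : $p_configured;
}

// An old attachment left in the database while the config says DISK.
// prints: database
echo pick_download_method( METHOD_DISK, 'binary-data' ), "\n";
```

With this rule, attachments moved to disk (content cleared) follow the configured method, while unmoved ones keep working.
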

thraxisp

2004-07-23 15:46

reporter   ~0006261

Uploaded a new copy of db2disk.tar.gz with the missing move_db2disk.php file.

vboctor

2004-07-23 18:17

manager   ~0006274

A modified version of the patch was applied to CVS.